Abstract


Computer  simulations  were  performed  that  used  neural  networks  to  synthesize 
filters  for  optical  correlators.  The  synthesized  filters  were  designed  to  maintain 
acceptable  recognition  performance  for  targets  on  cluttered  backgrounds  that  were  rotated 
relative  to  initial  (unsynthesized)  filters.  The  most  significant  results  employed  new 
stretch  and  hammer  neural  networks  which  train  with  guaranteed  upper  bounds  on 
computational  effort  and  generalize  with  guaranteed  lower  bounds  on  smoothness  and 
stability.  These  results  indicate  good  prospects  for  training  neural  networks  to  rapidly 
synthesize filters for a wide range of target distortions. They also indicate possible
significant  advantages  compared  to  searching  stored  filters. 


1.  INTRODUCTION


This  section  briefly  considers  the  approach  and  rationale  for  correlation  filter 
synthesis  using  neural  networks,  key  results  obtained,  and  papers  and  presentations 
produced  during  the  course  of  the  effort. 

1.1  Approach  and  Rationale 

The  use  of  optical  correlation  for  the  recognition  and  location  of  objects  (targets) 
in noisy or cluttered images is well known (see, for example, Refs. 4, 5, and 7 and
sources  cited  in  these  papers).  Briefly,  in  a  typical  real-time  optical  correlator  an  input 
image  is  loaded  onto  an  electrically  addressed  two-dimensional  spatial  light  modulator 
(SLM),  which  is  an  array  (e.g.,  128  by  128)  of  closely  spaced  active  apertures.  The 
output  from  the  input  SLM  is,  for  example,  an  array  of  binary  coherent  optical  amplitudes 
that  represents  the  input  image.  A  lens  (or  system  of  lenses)  forms  the  Fourier  transform 
of  this  binary  image  at  a  second  or  filter  SLM.  The  filter  SLM  consists  of  an  array  of 
binary or ternary states, typically 0° and 180° phase shifts or these two phase shifts plus a
zero-amplitude  state.  The  filter  SLM  states  are  determined  by  thresholding  the  conjugate 
of  the  Fourier  transform  or  spatial  frequency  pattern  of  a  target.  For  example,  if  a  given 
spatial  frequency  complex  number  lies  below  the  line  of  slope  45°  through  the  origin  in 
the  complex  plane  it  is  represented  by  the  0°  state;  otherwise  it  is  represented  by  the  180° 
state.  Another  lens  or  system  of  lenses  forms  the  Fourier  transform  of  the  product  of  the 
Fourier  transform  of  the  binary  image  and  the  array  of  filter  SLM  states.  If  targets  having 
the  spatial  frequencies  represented  in  the  filter  SLM  are  present  in  the  input  image,  then 
the  final  Fourier  transform  plane  has  bright  spots  called  correlation  peaks  at  the  target 
locations. 
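
The thresholding step described above can be sketched as follows. This is a minimal illustration, not the procedure used in the simulations: the function names and the `tla_degrees` parameter are hypothetical, and the sketch assumes that the threshold line is specified by its angle measured clockwise from the positive imaginary axis, so that 45° reproduces the slope-45° example in the text.

```python
import math

def bpof_state(value, tla_degrees=0.0):
    """Return the filter phase state (0 or 180 degrees) for one spatial frequency.

    Assumption: the threshold line passes through the origin at an angle
    tla_degrees measured clockwise from the positive imaginary axis, and values
    on the clockwise side of that line receive the 0-degree state.
    """
    angle = math.radians(tla_degrees)
    side = value.real * math.cos(angle) - value.imag * math.sin(angle)
    return 0 if side > 0 else 180

def bpof_from_spectrum(spectrum, tla_degrees=0.0):
    """Threshold the conjugate of a target spectrum, element by element."""
    return [[bpof_state(v.conjugate(), tla_degrees) for v in row] for row in spectrum]

# A value below the slope-45-degree line maps to 0, a value above it to 180.
print(bpof_state(1 + 0.5j, 45), bpof_state(0.5 + 1j, 45))
```

With a 0° angle the threshold line is the imaginary axis, so the state depends only on the sign of the real part.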

Thus a correlation peak indicates the presence of a target and specifies its location.
However,  if  the  target  is  rotated  or  scaled  relative  to  the  filter,  or  if  it  undergoes  any 
distortion  other  than  translation,  then  the  correlation  peak  is  degraded  in  amplitude 
relative  to  false  peaks  due  to  clutter  and  noise.  To  address  this  issue  adaptive  correlators 
that  load  updated  filters  into  the  filter  SLM  have  been  designed.  For  example,  if  the 
target  rotates  relative  to  the  initial  filter  the  correlation  peak  decreases,  and  this  change  is 
detected  by  a  video  camera.  New  filters  corresponding  to  different  target  rotations  are 
then  loaded  onto  the  filter  SLM  until  the  correlation  peak  is  restored.  The  success  of  this 
adaptive  feedback  approach  depends  on  the  availability  of  a  bank  of  stored  filters 
corresponding to many target rotations, scales, and other (more difficult) distortions such
as  those  due  to  aspect  angle  changes  or  partial  occlusions.  It  also  requires  rapid  searching 
of  the  stored  filter  bank  to  find  the  filter  that  restores  correlation  peak  degradation;  search 
times  of  less  than  1/30  sec  (which  correspond  to  standard  video  frame  rates)  may  be 
required. 

Correlation  filter  synthesis  is  an  alternate  and  possibly  more  elegant  approach  to 
avoiding correlation peak degradation due to target distortions, particularly distortions
other  than  rotation  and  scale  that  may  be  difficult  to  exhaustively  pre-compute.  In  this 
approach  (see  Refs.  4  and  7  and  sources  cited  therein)  gray-level  pixels  from  a  region  of 
the  original  input  image  in  the  neighborhood  of  the  target  location  identified  by  the 
correlation peak are input to neural network processors (or possibly to fuzzy-logic or
genetic  algorithm  based  systems).  These  processors  are  trained  to  produce  as  output  the 
filter  for  the  current  target  (as  rotated,  scaled,  or  otherwise  non-translationally  distorted) 
that  yields  the  best  possible  correlation  peak.  There  is  typically  one  software-simulated 
neural network for each filter parameter to be determined, and each network has all
target-neighborhood  pixel  gray  levels  as  its  input.  Since  trained  neural  networks  may  be 
understood  as  "smart"  data  interpolators,  the  stored  filter  and  the  filter  synthesis 
approaches  have  much  in  common:  in  the  former  new  filters  are  found  by  searching  a 
data  bank  consisting  of  the  filters  themselves;  in  the  latter  filters  are  formed  from  a 
distributed data bank that contains neural network interaction strengths or weights.

1.2  Key  Results  and  Outputs 

Excellent  computer  simulation  results  were  obtained  using  neural  networks  to 
synthesize  filters  for  optical  correlators  when  the  targets  (including  targets  on  cluttered 
backgrounds)  were  rotated  relative  to  the  filters.  The  most  significant  results  employed 
new  stretch  and  hammer  neural  networks  which  may  constitute  an  important  and 
enduring advance because they train with guaranteed upper bounds on computational
effort  and  generalize  with  guaranteed  lower  bounds  on  smoothness  and  stability.  These 
results  indicate  good  prospects  for  training  neural  networks  to  synthesize  filters  for  a 
wide  range  of  target  distortions.  They  also  indicate  possible  significant  advantages 
compared  to  searching  stored  filters. 

The  technical  effort  on  correlation  filter  synthesis  using  neural  networks  was 
successful  and  productive.  It  supported,  wholly  or  in  part,  research  that  produced: 
•  Four papers submitted to refereed journals (one published in Optics
Communications[1], one published in Applied Optics[2], and two under review
by IEEE Transactions on Neural Networks[3] and Neural Computation[9]).

•  Two conference presentations (SPIE Critical Review, San Jose, November
1991[4] and SPIE OE/Aerospace Sensing, Orlando, April 1992[5]).

•  Two  Electro-optics  Master's  Theses[6,7]  (one  entirely  on  filter  synthesis  using 
neural  networks)  and  an  Invention  Disclosure  [8]  on  stretch  and  hammer  neural 
networks. 

2.  TARGETS,  BACKGROUNDS,  AND  FILTERS 

This  section  discusses  the  targets,  backgrounds,  and  filters  used  in  computer 
simulations  to  investigate  correlation  filters  synthesized  by  neural  networks. 

2.1  Truck  Targets  and  Clutter  Backgrounds 

Figure 1a shows a typical truck target and clutter background used for training
neural networks for filter synthesis, and Figure 1b shows a typical binarized truck and
background  used  for  correlator  input.  Ref.  7  presents  additional  examples  of  targets  and 
backgrounds  and  describes  the  binarization  procedure.  All  targets  and  backgrounds 
originated  from  actual  gray-level  visible  or  infrared  images. 

2.2  BPOF  and  TPAF  Filters 

A filter with two phase states (typically 0° and 180°) is a binary phase only filter
(BPOF), and a filter with these two states plus the zero-amplitude state is a ternary phase
amplitude filter (TPAF). Figure 1c is a BPOF obtained from a binarized Fourier
transform of the truck in Figure 1a (which was defined to be at 0° rotation). The
binarization was performed using a threshold line angle (TLA, the angle between the
positive imaginary axis and a line through the origin) of 0°, so that complex
numbers in the positive and negative halves of the complex plane were represented by
0° and 180° phase shifts, respectively. Figure 1d is a TPAF obtained from
the BPOF by imposing a 10 to 60 pixel radius bandpass (i.e., all filter regions except the
annular region between these two radii were opaque) and by defining 9 by 9 binary
superpixels such that each superpixel had the same phase as the majority of its interior
pixels. It was necessary to define and use filter superpixels to reduce the number of
neural networks required for filter synthesis, thus limiting the computational effort.

Ref.  7  presents  additional  examples  of  BPOF  and  TPAF  filters. 
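
The superpixel and bandpass operations described in this section can be sketched as follows. The function names are hypothetical, phase states are coded as 0 and 1 for illustration, and the sketch assumes an odd superpixel size (such as 9) so that the majority is never tied.

```python
import math

def majority_superpixels(phases, size):
    """Collapse a square grid of binary phase states (0 or 1) into superpixels.

    Each size-by-size block takes the state held by the majority of its pixels.
    Assumes the grid side is a multiple of `size` and that `size` is odd.
    """
    blocks = len(phases) // size
    result = []
    for block_row in range(blocks):
        row = []
        for block_col in range(blocks):
            ones = sum(
                phases[block_row * size + r][block_col * size + c]
                for r in range(size)
                for c in range(size)
            )
            row.append(1 if 2 * ones > size * size else 0)
        result.append(row)
    return result

def in_bandpass(row, col, center, inner_radius, outer_radius):
    """True if a pixel lies in the annulus that the TPAF leaves transparent."""
    radius = math.hypot(row - center, col - center)
    return inner_radius <= radius <= outer_radius

grid = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 1, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 1],
]
superpixels = majority_superpixels(grid, 3)
```

Pixels for which `in_bandpass` is false would be assigned the zero-amplitude (opaque) state, which is what distinguishes the TPAF from the BPOF.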

3.  STRETCH  AND  HAMMER  NEURAL  NETWORKS 

This section discusses the radial basis function neural network that was used to
obtain  the  most  significant  correlation  filter  synthesis  results. 

3.1  Bounds  on  Computational  Effort,  Smoothness,  and  Stability 

Stretch  and  hammer  neural  networks  successfully  address  two  common  concerns 
in  using  a  neural  network  to  solve  a  practical  problem:  (1)  the  time  required  to  train  the 
neural  network  is  often  excessive,  even  for  a  supercomputer,  and  (2)  the  trained  neural 
network  often  does  not  generalize  effectively  enough  to  solve  the  problem.  For  stretch 
and  hammer  neural  networks  guaranteed  bounds  on  computational  effort  ensure  that  the 
maximum  numerical  precision  and  number  of  computational  steps  required  for  training 
can  be  specified  in  advance  of  training.  In  addition,  guaranteed  bounds  on  smoothness 
and  stability,  which  can  also  be  specified  in  advance  of  training,  ensure  that  each  neural 
network  output  changes  by  no  more  than  a  specified  value  if  the  training  data  are  changed 
by  a  small  amount. 

As  shown  in  Figure  2a,  stretch  and  hammer  neural  networks  are  feedforward 
architectures  that  have  separate  hidden  neuron  layers  for  stretching  and  hammering  in 
accordance with an easily visualized physical model. The mean $\bar{x}_j$ of the training values
for each input $x_1, x_2, \ldots, x_n$ is subtracted from each input at the input neurons. A
standard principal components transformation then forms linear combinations of these
zero-mean inputs through coefficients (or neural network weights) $a_{jk}$, where $j, k = 1, 2, \ldots, n$.
The outputs $u_1, u_2, \ldots, u_n$ of the stretch neurons are therefore linear transformations
of the original inputs that "stretch" these inputs to give them equal "importance". Each
hammer neuron $f_i$ has as input all stretch neuron outputs and forms an n-dimensional
Gaussian radial basis function of these inputs centered on training point $i$ with standard
deviation $s_i$, where $i = 1, 2, \ldots, m$ and $m$ is the number of training points. Each hammer
neuron output is multiplied by a coefficient $c_i$ to form an output neuron input. The output
neuron also has as input a bias $b_0$ and a linear combination, through coefficients
$b_1, b_2, \ldots, b_n$, of the stretch neuron outputs. Thus the final output $y$ consists of a bias term plus $n$
linear terms proportional to the principal-component-transformed inputs plus $m$ nonlinear
terms each proportional to an n-dimensional Gaussian function of these inputs.
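
The output computation described above can be sketched as a single function. This is an illustrative sketch rather than the authors' implementation: the function and argument names are hypothetical, and the Gaussian is assumed to have the common form exp(-|u - u_i|^2 / (2 s_i^2)), which the text does not state explicitly.

```python
import math

def stretch_hammer_output(u, centers, widths, c, b0, b):
    """Evaluate y = b0 + sum_k b[k]*u[k] + sum_i c[i]*G_i(u).

    u       : stretch neuron outputs (principal-component-transformed inputs)
    centers : training points in the same coordinates, one per hammer neuron
    widths  : Gaussian standard deviations s_i
    Assumption: G_i(u) = exp(-|u - center_i|^2 / (2 * s_i^2)).
    """
    linear = b0 + sum(bk * uk for bk, uk in zip(b, u))
    hammer = 0.0
    for center, s, ci in zip(centers, widths, c):
        dist2 = sum((uk - ck) ** 2 for uk, ck in zip(u, center))
        hammer += ci * math.exp(-dist2 / (2.0 * s * s))
    return linear + hammer

centers = [[0.0, 0.0], [1.0, 1.0]]
y_at_center = stretch_hammer_output([0.0, 0.0], centers, [0.5, 0.5], [2.0, -1.0], 0.1, [1.0, -1.0])
y_far_away = stretch_hammer_output([100.0, 100.0], centers, [0.5, 0.5], [2.0, -1.0], 0.1, [1.0, -1.0])
```

Far from every training point the Gaussian terms vanish and only the bias and linear terms remain, which is the behavior the hidden-layer separation is meant to provide.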
3.2  Training  Procedures  and  Testing  Results

As discussed in Refs. 2, 3, 5, 7, and 8, training the stretch and hammer neural
network consists of (1) transforming the inputs to principal components coordinates, thus
determining the weights $\bar{x}_j$ and $a_{jk}$, (2) finding an a priori hypersurface such as a least
squares hyperplane through the training points, thus determining $b_0, b_1, b_2, \ldots, b_n$,
(3) finding the Gaussian radial basis function standard deviations, thus determining the
weights $s_i$, and (4) finding the Gaussian radial basis function coefficients $c_i$. The training
points are interpolated because the number of basis function coefficients equals the
number of training points. The basis function standard deviations are chosen to be as
large as possible consistent with maintaining diagonal dominance for the simultaneous
linear equations that must be solved to obtain the basis function coefficients. As shown
rigorously in Refs. 2, 3, and 5, this choice ensures that training example generalization is
maximally smooth and stable consistent with unique training in a predeterminable
number of steps.
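
Steps (2) through (4) can be illustrated for a single input as follows. This sketch is not the procedure of the cited references: it assumes one shared width `s` instead of one width per training point, searches for the width only on a halving grid, uses a hypothetical `dominance_limit` of 0.5 as the dominance criterion, and solves the linear equations by Gauss-Seidel iteration, which is known to converge for diagonally dominant systems.

```python
import math

def train_one_input(xs, ys, dominance_limit=0.5, sweeps=60):
    """Fit y(x) = b0 + b1*x + sum_i c_i * exp(-(x - x_i)^2 / (2 s^2)).

    Assumes distinct training inputs. All simplifications are illustrative.
    """
    m = len(xs)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    # Step (2): least squares line through the training points.
    b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    b0 = mean_y - b1 * mean_x
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

    def gram(s):
        return [[math.exp(-((xi - xj) ** 2) / (2 * s * s)) for xj in xs] for xi in xs]

    def worst_off_diagonal_sum(g):
        return max(sum(row) - 1.0 for row in g)  # diagonal entries equal 1

    # Step (3): largest width on a halving grid that keeps diagonal dominance.
    s = max(xs) - min(xs)
    while worst_off_diagonal_sum(gram(s)) > dominance_limit:
        s *= 0.5
    g = gram(s)

    # Step (4): solve g c = residuals by Gauss-Seidel sweeps.
    c = [0.0] * m
    for _ in range(sweeps):
        for i in range(m):
            others = sum(g[i][j] * c[j] for j in range(m) if j != i)
            c[i] = residuals[i] - others  # g[i][i] == 1
    return b0, b1, s, c

def predict(x, xs, b0, b1, s, c):
    return b0 + b1 * x + sum(
        ci * math.exp(-((x - xi) ** 2) / (2 * s * s)) for xi, ci in zip(xs, c)
    )

xs = [0.0, 1.0, 2.0, 3.0, 5.0]
ys = [0.0, 0.8, 0.9, 0.1, -0.8]
b0, b1, s, c = train_one_input(xs, ys)
```

Because the number of sweeps and the number of halvings are bounded in advance, the computational effort of this sketch is predictable, which mirrors the motivation given in the text.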

Figure  2b  compares  stretch  and  hammer  neural  network  and  natural  cubic  spline 
results  for  one-input  training  examples.  The  curves  are  comparable  except  for  sparse 
training  example  regions,  where  the  stretch  and  hammer  curve  approaches  the  least 
squares  line.  This  behavior  is  desirable:  cubic  spline  curves  are  the  smoothest  possible, 
but  they  typically  exhibit  unrealistic  deviations  from  the  training  examples  for 
extrapolation  and  prediction. 

4.  FILTER  SYNTHESIS  PROCEDURES

This  section  discusses  the  most  successful  procedures  for  obtaining  inputs  for 
neural  network  correlation  filter  synthesis  and  for  specifying  outputs  that  reduce  the 
number  of  separate  neural  networks  required.  Many  other  (less  successful)  procedures 
are  discussed  in  Ref.  7. 

4.1  Input  and  Output  Specification

The  most  significant  neural  network  filter  synthesis  results  used  600  separate 
stretch  and  hammer  neural  networks,  each  with  31  inputs.  Each  input  was  the  mean  gray 
level in one of 31 radial wedges covering a 9 by 9 pixel region centered on the target,
which  was  located  by  the  correlation  peak  in  a  128  x  128  pixel  input  scene.  The  output  of 
each  neural  network  was  one  of  the  600  binary  9  by  9  superpixels  in  a  TPAF,  where  the 
TPAF was superpixelated so that training could be accomplished in a few hours on
desktop  computers. 
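
The wedge inputs described above can be sketched as follows. The exact wedge geometry is given in Ref. 7 and is not reproduced here, so this sketch assumes wedges of equal angular width about the region center and skips the center pixel; the function name is hypothetical.

```python
import math

def wedge_means(patch, wedges=31):
    """Mean gray level in each angular wedge of a square patch.

    Assumptions: wedges have equal angular width, angles are measured about
    the patch center, and the center pixel itself is skipped.
    Returns (means, counts); a wedge that contains no pixels has mean None.
    """
    size = len(patch)
    center = (size - 1) / 2.0
    totals = [0.0] * wedges
    counts = [0] * wedges
    for row in range(size):
        for col in range(size):
            dy, dx = row - center, col - center
            if dx == 0 and dy == 0:
                continue
            angle = math.atan2(dy, dx) % (2.0 * math.pi)
            index = min(int(angle / (2.0 * math.pi) * wedges), wedges - 1)
            totals[index] += patch[row][col]
            counts[index] += 1
    means = [t / n if n else None for t, n in zip(totals, counts)]
    return means, counts

uniform_patch = [[7] * 9 for _ in range(9)]
means, counts = wedge_means(uniform_patch)
```

The resulting list of 31 values would be the common input vector for all 600 networks, each of which produces one superpixel.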

4.2  Training  Specification

As  described  in  Ref.  7,  there  were  138  sets  of  training  inputs:  46  with  the  truck 
target rotated 0°, 2°, ..., 90° on a 66 (out of 256) gray level background, 46 with these
angles  on  a  142  gray  level  background,  and  46  with  these  angles  on  a  cluttered 
background.  For  each  set  of  training  inputs  the  output  for  each  of  the  600  neural 
networks  was  one  of  the  binary  superpixels  in  the  TPAF  for  the  truck  on  a  blank  (0  gray 
level)  background  rotated  by  the  training  input  angle. 
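
The count of 138 training sets follows directly from the angle and background lists, as this short enumeration shows. The background labels are illustrative strings, not identifiers from the original software.

```python
training_angles = list(range(0, 91, 2))  # 0, 2, ..., 90 degrees
backgrounds = ["gray level 66", "gray level 142", "clutter"]
training_sets = [
    (angle, background) for background in backgrounds for angle in training_angles
]
print(len(training_angles), len(training_sets))
```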

5.  FILTER  SYNTHESIS  RESULTS 

This  section  discusses  the  most  significant  correlation  peak  to  clutter-ratio  results 
for  neural  network  filters  synthesized  using  the  procedures  discussed  above.  Many  other 
results  (less  significant  in  terms  of  their  practical  potential  for  correlator  systems)  are 
discussed  in  Ref.  7. 

5.1  Correlation  Peak  to  Clutter  Ratio  Plots 

Significant  filter  synthesis  results  from  Ref.  7  using  stretch  and  hammer  neural 
networks  are  shown  in  Figure  3.  Here  correlation  peak  to  clutter  ratio  (defined  as  the 
highest  peak  in  a  5  by  5  pixel  target-centered  grid  divided  by  the  highest  peak  in  the 
remainder  of  the  128  x  128  pixel  region)  is  plotted  versus  in-plane  target  rotation  angle 
for  three  correlation  filters:  the  best  possible  filter  (i.e.,  the  9  by  9  superpixel  TPAF  from 
the  Fourier  transform  of  the  truck  at  the  input  rotation  angle,  where  binarization  of  the 
truck  for  the  correlator  input  is  selected  for  the  best  peak  to  clutter  ratio),  the  fixed  zero 
degree filter (i.e., the best possible filter for the truck at a fixed 0° rotation angle), and the
filter  synthesized  by  600  stretch  and  hammer  neural  networks. 
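
The peak to clutter ratio defined above can be computed as in the following sketch. The function name is hypothetical, and the conversion to decibels assumes that the correlation plane holds intensities, so that the ratio in dB is 10 log10 of the peak ratio; the text does not state which convention was used.

```python
import math

def peak_to_clutter_db(plane, target_row, target_col, half_width=2):
    """Peak to clutter ratio of a correlation plane, in decibels.

    The target peak is the largest value in the (2*half_width + 1)-square grid
    centered on the target; the clutter peak is the largest value elsewhere.
    Assumption: `plane` holds non-negative intensities, dB = 10*log10(ratio).
    """
    target_peak = 0.0
    clutter_peak = 0.0
    for r, row in enumerate(plane):
        for c, value in enumerate(row):
            near = abs(r - target_row) <= half_width and abs(c - target_col) <= half_width
            if near:
                target_peak = max(target_peak, value)
            else:
                clutter_peak = max(clutter_peak, value)
    return 10.0 * math.log10(target_peak / clutter_peak)

plane = [[0.1] * 8 for _ in range(8)]
plane[3][4] = 4.0  # correlation peak, one pixel from the nominal target location
plane[0][7] = 2.0  # strongest clutter peak
ratio_db = peak_to_clutter_db(plane, 3, 3)
```

Under this convention a ratio of 3 dB corresponds to a target peak about twice as strong as the strongest clutter peak.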

5.2  Evaluation  of  Synthesized  Filters 

Figure  3a  shows  results  for  a  clutter  background  not  used  in  training,  and  Figure 
3b  shows  results  for  both  a  clutter  background  and  target  rotation  angles  not  used  in 
training (1°, 3°, ..., 89°). Note that the peak to clutter ratio for the fixed filter falls below
3 dB after less than 3° of target rotation, whereas the neural network synthesized filter
remains above 3 dB in Figure 3a and remains above 3 dB in Figure 3b except at two (of
the 89) testing angles.

As  discussed  in  Ref.  7, 27  additional  graphs  similar  to  those  shown  in  Figure  3 
were  produced  for  different  input  scene  and  target  rotation  angle  sampling  patterns, 
different  clutter  backgrounds,  peak  to  sidelobe  instead  of  peak  to  clutter  ratios,  and 
standard  backpropagation  (see,  for  example,  R.  P.  Lippmann,  "An  Introduction  to 
Computing  with  Neural  Nets,"  IEEE  ASSP  Mag.,  Vol.  4,  pp.  4-22,  1987)  instead  of 
stretch  and  hammer  neural  networks.  One  backpropagation  neural  network  with  the  same 
inputs,  outputs,  and  training  as  the  600  stretch  and  hammer  neural  networks  used  to 
produce  the  results  shown  in  Figure  3  yielded  better  results  (i.e.,  higher  peak  to  clutter 
ratios  for  the  synthesized  filters)  than  the  stretch  and  hammer  neural  networks.  However, 
this backpropagation neural network (which had 32 inputs, 600 outputs, and 200
hidden-layer neurons with 50 percent of the interconnections randomly removed) had
approximately  17  times  more  adjustable  parameters  available  per  output  (although  these 
parameters  were  not  all  independent)  than  the  stretch  and  hammer  neural  network,  and  its 
superior  performance  may  be  attributed  to  this  factor.  Also,  backpropagation  neural 
networks  with  approximately  the  same  number  of  adjustable  parameters  (i.e.,  weights)  as 
the  stretch  and  hammer  neural  networks  (for  which  training  convergence  can  be 
guaranteed)  did  not  converge  in  training.  Finally,  the  one  backpropagation  neural 
network  that  yielded  better  results  required  approximately  40  hours  to  train  on  a  486-class 
33  MHz  desktop  computer,  whereas  the  600  stretch  and  hammer  neural  networks  required 
approximately  three  hours. 

6.  CONCLUSIONS  AND  PROSPECTS 

The storage of only 102,000 (i.e., 600 times the sum 138 + 31 + 1) parameters for all 600
stretch  and  hammer  neural  networks  was  shown  to  permit  the  synthesis  of  filters  that 
yielded  peak  to  clutter  ratios  above  3  dB  in  more  than  90  percent  of  the  cases  for  both 
clutter  backgrounds  and  target  rotation  angles  not  used  in  training.  This  generally 
acceptable  performance  is  particularly  significant  in  view  of  the  fact  that  neural  networks 
may  synthesize  suitable  filters  for  target  tracking  (but  not  in  general  for  target  detection, 
since  an  initial  correlation  peak  is  required)  much  faster  than  "smart"  filter  search 
strategies.  The  synthesis  of  more  than  ten  million  filters  per  second  may  be  feasible  if 
hardware  rather  than  software  simulated  neural  networks  are  employed.  Stretch  and 
hammer  and  related  basis  function  neural  networks,  because  of  their  guaranteed  upper 
bounds  on  training  computational  effort  and  their  guaranteed  lower  bounds  on 
generalization smoothness and stability, may be ideal for synthesizing filters for a wide
range  of  "difficult"  target  distortions,  including  aspect  angle  and  obscuration  distortions 
for  which  training  data  may  be  limited.  For  these  distortions  the  neural  network  synthesis 
approach  may  have  significant  advantages  compared  to  searching  stored  filters. 
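
The stored-parameter count quoted in this section can be checked with a few lines of arithmetic. The interpretation of the three terms (138 Gaussian coefficients, 31 linear coefficients, and one bias per network) is an assumption based on the network description in Section 3.1.

```python
networks = 600
hammer_coefficients = 138  # assumed: one c_i per training point
linear_coefficients = 31   # assumed: one b_k per input
bias_terms = 1             # assumed: b_0
per_network = hammer_coefficients + linear_coefficients + bias_terms
total_parameters = networks * per_network
print(per_network, total_parameters)
```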

Figure 1. (a) Typical truck target and clutter background used for neural network
training. (b) Typical binarized truck and background used for correlator
input. (c) BPOF from 0° TLA binarized Fourier transform of truck. (d)
TPAF from BPOF using 9 x 9 superpixels and 10-60 pixel radius
bandpass.

Figure 3. Correlation peak to clutter ratio versus in-plane target rotation angle for
three correlation filters: best possible filter, fixed zero degree filter, and
filter synthesized by stretch and hammer neural networks, where
synthesized filter is for (a) clutter background not used in training and (b)
clutter background and rotation angles not used in training.


OPTICS COMMUNICATIONS, Vol. 85, pp. 311-314, 15 Sep 91


Optical-resonator-based  neural  network 

Steven C. Gustafson, Gordon R. Little and Darren M. Simon

Research Institute, University of Dayton, Dayton, OH 45469, USA
Received  4  June  1991 


A neural network model based on an optical resonator is described and its pattern recognition performance is investigated in
computer simulations.


This paper describes a neural network model in a form suitable for performing computer simulation
experiments and assessing possible optical implementations [1]. The model is consistent with optical resonator
designs that may include dynamic holograms and threshold phase conjugate mirrors [2], and it could be of
near-term value in the development of new pattern recognition algorithms.

In many all-optical computing architectures, dynamic holograms are envisioned for interconnection and storage
functions and nonlinear components, such as arrays of bistable optical devices or thresholded phase conjugate
mirrors, are envisioned for decision operations. The necessary adaptation and feedback interactions between
the interconnection and decision components are often achieved by incorporating these components in
linear or ring resonators [3-25].

A simple and general formulation of a neural network model consistent with such optical resonator designs
may be obtained by well-known methods in which plane wave amplitudes and phases are specified at discrete
times separated by the resonator period. In this formulation the model inputs and outputs are complex-element
vectors, and a state vector and a hologram matrix evolve in time according to a set of coupled nonlinear difference
equations that represent, in general, a high-order threshold logic [26]. The hologram matrix is a function
of the outer product matrix of the evolving state vector and has a form that depends on the hologram and
resonator geometry.

Fig. 1. Neural network model based on an optical resonator.

A diagram of the model and equations for the model are given in fig. 1. Note the term in the hologram matrix
equation proportional to the outer product matrix of the state vector with either (i) elements on each diagonal
or (ii) elements on the main diagonal replaced by their sum. This term may be readily derived for state vector
elements as plane waves with either (i) evenly spaced or (ii) pairwise unequally spaced propagation directions,
respectively.

For example, consider three plane waves with evenly spaced propagation directions $\theta-\delta$, $\theta$, and $\theta+\delta$ that
record a diffraction pattern (i.e., a hologram) having amplitude transmittance proportional to the squared
magnitude of the sum of the waves. Using $e^{i\theta}$ to represent a plane wave with propagation direction $\theta$, consider
the reconstruction of this hologram with waves having the same propagation directions but different complex
amplitudes. There are then seven output waves proportional to the terms of

$$\left| x^{T} e \right|^{2} \left( y^{T} e \right), \qquad e = \left[ e^{i(\theta-\delta)},\ e^{i\theta},\ e^{i(\theta+\delta)} \right]^{T}, \qquad (1)$$

where $x_n$ are the recording wave amplitudes, $y_n$ are the reconstructing wave amplitudes, and T indicates transpose.
From eq. (1) the three central output waves are proportional to the terms of

$$\mathcal{M}[x x^{\dagger}]\, y, \qquad (2)$$

where $\dagger$ is the complex transpose operator and $\mathcal{M}$ is an operator that replaces each diagonal with the sum of
the elements along that diagonal.
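
The relation between the seven output waves and the diagonal-sum form can be checked numerically. The sketch below expands the product of the hologram transmittance and the reconstructing waves term by term and compares the three central waves with the diagonal-sum operator applied to the outer product matrix. The function names and the sample amplitudes are illustrative only.

```python
def diagonal_sum(matrix):
    """Replace every diagonal of a square matrix by the sum of its elements."""
    n = len(matrix)
    sums = {}
    for r in range(n):
        for c in range(n):
            sums[r - c] = sums.get(r - c, 0) + matrix[r][c]
    return [[sums[r - c] for c in range(n)] for r in range(n)]

def output_waves(x, y):
    """Amplitudes of all output waves, keyed by direction offset in units of delta.

    Expands |sum_a x_a e_a|^2 * (sum_c y_c e_c), where wave number a propagates
    at theta + (a - 1) * delta for a = 0, 1, 2.
    """
    waves = {}
    n = len(x)
    for a in range(n):
        for b in range(n):
            for c in range(n):
                offset = (b - a) + (c - (n - 1) // 2)
                waves[offset] = waves.get(offset, 0) + x[a].conjugate() * x[b] * y[c]
    return waves

x = [1 + 2j, 0.5 - 1j, -2 + 0.5j]  # recording wave amplitudes (arbitrary values)
y = [0.3 + 0.1j, -1j, 2 + 0j]      # reconstructing wave amplitudes (arbitrary values)
outer = [[xb * xa.conjugate() for xa in x] for xb in x]  # x times its conjugate transpose
m = diagonal_sum(outer)
central = [sum(m[r][c] * y[c] for c in range(3)) for r in range(3)]
waves = output_waves(x, y)
```

The dictionary `waves` has seven keys, from -3 to 3, and the entries at offsets -1, 0, and 1 agree with the three elements of `central`.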

Some comments on the model are: (i) The hologram matrix is self-referenced in that no separate reference
beams (e.g., at different angles for different recordings) are involved. (ii) The hologram matrix could at least
approximately represent many forms of diffracting structures: thin or thick, amplitude or phase, static or dynamic,
reflection or transmission. (iii) The nonlinear operator performs no interconnection operations because
it independently replaces each complex element of its argument by another complex element. (iv) The nonlinear
operator may incorporate gain or phase conjugation to compensate for wide-angle scattering from the
hologram. (v) The nonlinear operator could approximate many types of components, including arrays of bistable
optical devices and phase conjugate mirrors with thresholding and gain. (vi) The input and output matrices
A and B may represent input and output devices such as beam splitters.

The performance of the model as a pattern recognizer or associative memory for the exclusive-or function
was investigated in computer simulations. In this investigation r(t) was a vector of three complex elements,
H(t) was a 3x3 matrix of complex elements, the nonlinear operator replaced each element of its argument
vector by the element squared and divided by the resulting vector magnitude, $\mathcal{M}$ was an operator that replaced
each diagonal of its argument matrix with the sum of diagonal elements (as described in the example above
for equally spaced propagation directions), and A and B were 3x3 identity matrices.
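
The nonlinear operator used in the simulations can be sketched as follows. The function name is hypothetical, and the sketch reads "the resulting vector magnitude" as the Euclidean magnitude of the vector of squared elements.

```python
import math

def nonlinear_operator(vector):
    """Square each complex element, then divide by the magnitude of the result."""
    squared = [z * z for z in vector]
    magnitude = math.sqrt(sum(abs(z) ** 2 for z in squared))
    return [z / magnitude for z in squared]

out = nonlinear_operator([1 + 1j, 2 + 0j, 0 + 1j])
```

Squaring doubles the complex-plane angle of each element, and the division keeps the output at unit magnitude, so the operator acts on each element independently apart from the common scale factor.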

The model was trained on each of the four exclusive-or function patterns, where the orthogonal complex-plane
vectors $(1+i)/\sqrt{2}$ and $(-1+i)/\sqrt{2}$ represented 1 and 0, respectively. For training, the initial hologram
matrix H(0) for each pattern was the 3x3 identity matrix, the first two elements of r(0) were the exclusive-or
function inputs, and the third element of r(0) was the exclusive-or function output. The model was
allowed to iterate for each pattern until H(t) no longer changed with t so that four training holograms were
generated.

The model was tested with H(0) set equal to the sum of the four training holograms and with the third
element of r(0) set equal to the complex-plane vector i (which is the unit magnitude vector that bisects the
angle between the vectors that represent 1 and 0). The model was allowed to iterate for each of the four exclusive-or
patterns, and for each case the final third element of r was determined. For appropriately selected
values of $\alpha$, $\beta$, and $\gamma$, it was found that each of the four final third elements of r had an angle in the complex
plane that more closely matched the angle representing a 1 (45°) if this was the correct exclusive-or function
output or a 0 (135°) if this was the correct output.
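
The complex-plane encoding and the nearest-angle decision described above can be sketched as follows. The names `ONE`, `ZERO`, `angle_degrees`, and `classify` are hypothetical; the sketch covers only the encoding and the decision rule, not the resonator iteration itself, whose equations appear in fig. 1.

```python
import cmath
import math

ONE = (1 + 1j) / math.sqrt(2)    # represents logical 1, angle 45 degrees
ZERO = (-1 + 1j) / math.sqrt(2)  # represents logical 0, angle 135 degrees

def angle_degrees(z):
    """Complex-plane angle of z in the range [0, 360)."""
    return math.degrees(cmath.phase(z)) % 360.0

def classify(z):
    """Return 1 or 0 according to which reference angle is closer to the angle of z."""
    def gap(a, b):
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)
    angle = angle_degrees(z)
    return 1 if gap(angle, angle_degrees(ONE)) < gap(angle, angle_degrees(ZERO)) else 0

# The two reference vectors are orthogonal when viewed as vectors in the plane.
dot = ONE.real * ZERO.real + ONE.imag * ZERO.imag
```

The test vector i lies at 90°, halfway between the two reference angles, so the initial third element favors neither output.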

The mean number of iterations required in the setting phase for the correct convergence of the third element
of r for the combined four exclusive-or cases was investigated as a function of the parameters $\alpha$, $\beta$, and $\gamma$. Convergence
was defined to occur when the third element complex-plane angle variation between iterations was
less than one part per million. It was found that the mean number of iterations for correct convergence increased
as a linear function of the logarithm of $\gamma$ over at least the range $\gamma = 0.002$ to $\gamma = 100$. Fig. 2 shows the
mean number of iterations for correct convergence versus $\alpha$ and $\beta$ for $\gamma = 0.1$. Note that correct convergence
occurs for a wide range of the model parameters. Fig. 3 shows the complex-plane angle of the final minus the
initial third element of r versus $\alpha$ for $\beta = 0.5$, $\gamma = 0.1$, and the four exclusive-or function cases. This angle is the
angle of the third element of the vector output of the nonlinear operator in fig. 1. In fig. 3, a 1 output is
ideally 45° - 90° + 360° = 315° and a 0 output is ideally 135° - 90° = 45°.

It may be concluded that the optical-resonator-based neural network model can successfully recognize or
classify exclusive-or patterns on which it has been trained for a wide range of model parameters. This result
is significant because exclusive-or (or inverse exclusive-or) patterns cannot be classified using a linear model.
Thus a practical pattern classification algorithm based on the optical resonator model may be feasible. Assuming
that suitable optical materials and components become available, a long-term consequence could be
the development of hardware neural network pattern recognition systems based on optical resonator designs.


Fig. 2. Mean iterations for correct convergence versus $\alpha$ and $\beta$ for
$\gamma = 0.1$. No convergence (correct or incorrect) was obtained in
the non-hatched region.


Fig. 3. Complex-plane angle of the final minus the initial third
element of r versus $\alpha$ for $\beta = 0.5$, $\gamma = 0.1$, and the four exclusive-or
function cases. The curves identified with open and closed
squares represent the cases (0,0)→(0) and (1,1)→(0), respectively,
while the single curve identified with solid diamonds represents
the two cases (0,1)→(1) and (1,0)→(1). Acceptable
operation, as defined by a 90° complex-plane angle decision
boundary, is achieved for $\alpha$ < 23.
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A  boMU-funetion  teehniqu*  for  rteonstructing  imagu  with 
miasing  pixeU  is  <Useribtd.  This  tsehntqus  yiel^  optimal 
rteons^ntetsdimagt  smoothness  in  UuUsaeh  basis- function 
width  is  maxtmi^  consistent  with  an  acceptable  lead  of 
computadoiud  effort, 

words:  Image  reconstruction,  image  restoration. 

The reconstruction of missing pixels in images is a problem
that arises in many contexts. Examples include the uniform-grid
resampling of Earth-from-satellite images that
have undergone nonlinear geometric transformations to
remove motion effects, the restoration of partially
obscured nonlinear image boundaries, and the reconstruction
of arbitrary-view images from selected-view data in
tomography. Typical approaches to the reconstruction
problem include the use of interpolation or approximation
techniques, such as bilinear interpolation or cubic B splines.
However, these techniques usually require that all pixels be
located on a uniform grid, and they typically yield reconstructed
pixel values that are not consistent with known
image-formation processes, such as processes modeled by
the convolution of Gaussian functions with impulse functions
at the pixel locations.

Radial basis-function interpolation and approximation
techniques avoid these limitations, but they typically
introduce two major concerns: (1) the specification of the
extent or the width of the basis functions after their form
has been selected consistent with known image-formation
processes and (2) the limitation of the level of computational
effort required (in both precision and number of
computational steps) to obtain the basis-function coefficients,
particularly if the number of known pixels is large.
As shown below, basis-function techniques can be designed
to address these concerns: after the form of the basis
functions has been selected consistent with a priori knowledge,
optimal reconstructed-image smoothness is achieved
in that each basis-function width is maximized consistent
with an acceptable level of computational effort. Maximizing
basis-function widths may be related to the optimal
selection of smoothing parameters in image restoration by
regularization.

A typical reconstruction task and an optimal (in the sense
indicated above) interpolation approach are as follows.
An image has known pixel values $z_i$ (gray level or binary) at
locations $(x_i, y_i)$, with $i = 1, 2, \ldots, n$, and unknown (i.e.,
missing) pixel values $z_k$ at locations $(x_k, y_k)$, with $k = n + 1,
n + 2, \ldots, n + m$. Ideally, small clusters of unknown
pixels are surrounded by large clusters of known noise-free
pixels so that interpolation is appropriate. The basis
functions $f_i(x, y) = f_i(x - x_i, y - y_i, v_i)$ are used to fit

$$z(x, y) = c_1 f_1(x, y) + c_2 f_2(x, y) + \cdots + c_n f_n(x, y) \tag{1}$$

to the data $z_i(x_i, y_i)$, where for given $(x, y)$ the magnitude
$|f_i(x, y)|$ increases monotonically as the positive width
parameter $v_i$ increases and where $|f_i(x, y)|$ approaches zero as
either $|x - x_i|$ or $|y - y_i|$ becomes large. If we desire, $v_i$
may be the second moment of $|f_i(x, y)|$, $v_i$ may be a constant
independent of $i$, and $f_i(x, y)$ may equal $f(r_i, v_i)$ with $r_i =
[(x - x_i)^2 + (y - y_i)^2]^{1/2}$ so that the basis functions are
radial. With the functional form of $f_i(x, y)$ specified, the $v_i$
are obtained by solving the $n$ independent nonlinear equations

$$|f_{ii}| - \sum_{j \neq i} |f_{ij}| = \varepsilon, \quad i = 1, 2, \ldots, n, \tag{2}$$

where $f_{ij} = f_i(x_j, y_j)$, $j = 1, 2, \ldots, n$, and $\varepsilon$ is a positive
constant that specifies the degree of diagonal dominance of
the matrix $F = [f_{ij}]$; thus $\varepsilon$, as discussed below, limits the
level of computational effort. The basis-function coefficients
$c_i$ are then determined by solving the $n$ simultaneous
linear equations in $n$ unknowns:

$$z_j = c_1 f_{1j} + c_2 f_{2j} + \cdots + c_n f_{nj}, \quad j = 1, 2, \ldots, n. \tag{3}$$

Finally, the unknown pixel values $z_k$ are determined by
substituting the pixel locations $(x_k, y_k)$ for $(x, y)$ in Eq. (1).
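
The three steps above (solve Eq. (2) for each width, solve Eq. (3) for the coefficients, evaluate Eq. (1) at the missing locations) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the Gaussian radial form exp(-r^2/2v) mentioned later in the paper, uses bisection as the "standard method" for each univariate equation, and uses plain Gaussian elimination for the linear system. All function names, the 3 x 3 example patch, and the value 0.5 for the dominance constant are hypothetical.

```python
import math

def gaussian(r2, v):
    """Gaussian radial basis function exp(-r^2 / (2 v)) for squared radius r2."""
    return math.exp(-r2 / (2.0 * v))

def dominance(i, pts, v):
    """Left side of Eq. (2) for row i: |f_ii| minus the sum of |f_ij|, j != i."""
    xi, yi = pts[i]
    off = sum(gaussian((xj - xi) ** 2 + (yj - yi) ** 2, v)
              for j, (xj, yj) in enumerate(pts) if j != i)
    return 1.0 - off  # f_ii = 1 because the Gaussian peaks at its own center

def solve_width(i, pts, eps, lo=1e-6, hi=1e6, iters=200):
    """Univariate bisection for the width v_i at which Eq. (2) holds."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)          # geometric midpoint: v spans decades
        if dominance(i, pts, mid) > eps:  # more dominant than required: widen
            lo = mid
        else:
            hi = mid
    return lo

def solve_linear(a, b):
    """Gaussian elimination with partial pivoting; returns x with a x = b."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def reconstruct(pts, z, targets, eps):
    """Fit Eq. (1) to the known pixels and evaluate it at the target locations."""
    n = len(pts)
    v = [solve_width(i, pts, eps) for i in range(n)]

    def f(i, x, y):
        return gaussian((x - pts[i][0]) ** 2 + (y - pts[i][1]) ** 2, v[i])

    # Row j of the linear system is Eq. (3): z_j = sum_i c_i f_ij.
    system = [[f(i, pts[j][0], pts[j][1]) for i in range(n)] for j in range(n)]
    c = solve_linear(system, z)
    values = [sum(c[i] * f(i, x, y) for i in range(n)) for x, y in targets]
    return values, v, c

# Hypothetical example: a 3 x 3 patch whose center pixel (1, 1) is missing.
KNOWN = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1), (2, 2)]
GRAY = [10.0, 12.0, 14.0, 10.0, 14.0, 10.0, 12.0, 14.0]
EPS = 0.5
(CENTER,), WIDTHS, COEFFS = reconstruct(KNOWN, GRAY, [(1, 1)], EPS)
print(CENTER, WIDTHS[0])
```

Because each width is found from its own one-variable equation, the only coupled computation is the linear solve, which is the point the paper makes about limiting computational effort.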

Since the number of pixels may be large, an assessment of
the computational effort involved in solving Eqs. (2) and (3)
is required. First, note that Eq. (2) specifies $n$ independent
nonlinear equations in one positive variable, and thus
each of these equations may be solved separately by using
standard methods. Second, note that Eq. (3) specifies a
linear system, and thus the computational effort required
to obtain a solution depends on a condition number of the
matrix $F$. The two-norm condition number $\kappa_2$, which
equals the square root of the ratio of the largest to the
smallest eigenvalue of the product of $F^T$ and $F$, typically
controls the required numerical precision and the number
of computational steps independent of the algorithm (iterative
or direct with iterative improvement) used to obtain a
solution. This condition number may be limited to an
acceptably small value by specifying a sufficiently large
value for $\varepsilon$ in Eq. (2). A result obtained by Varah, who
used standard matrix norm notation, is $\|F^{-1}\|_\infty \le 1/\varepsilon$.

Combining this result with the expressions $\kappa_2 =
\|F\|_2 \|F^{-1}\|_2$, $\|F\|_\infty = \max_i \sum_j |f_{ij}|$, and $\|B\|_2 \le
\sqrt{n}\, \|B\|_\infty$ for any $n \times n$ matrix $B$, and noting from Eq. (2)
that the second expression equals $2d - \varepsilon$, where $d = \max_i |f_{ii}|$,
we may conclude that $\kappa_2 \le n(2d - \varepsilon)/\varepsilon$. Thus $\varepsilon$ may
be used to limit $\kappa_2$, which, as indicated above, typically
controls the numerical precision and the number of computational
steps required to solve for the coefficients $c_1,
c_2, \ldots, c_n$. Significantly, the degree of diagonal dominance
of $F$ may be adjusted to yield optimal interpolation
surface or reconstructed-image smoothness in that each
basis-function width parameter $v_i$ is maximized consistent
with an acceptable level of computational effort.
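
The norm relations used above can be checked numerically on a small example. The sketch below is only an illustration: the 3 x 3 matrix is a hypothetical row-diagonally-dominant matrix with unit diagonal (so d = 1 and the dominance constant is 0.4), and the helper names are assumptions. It confirms that the infinity-norm of the matrix equals 2d minus the dominance constant, that the infinity-norm of the inverse respects Varah's bound, and it evaluates the resulting bound on the two-norm condition number.

```python
def inf_norm(a):
    """Matrix infinity-norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (fine for small matrices)."""
    n = len(a)
    m = [row[:] + [1.0 if r == c else 0.0 for c in range(n)]
         for r, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]

def row_dominance(a):
    """Smallest value over rows of |a_ii| minus the sum of |a_ij|, j != i."""
    return min(abs(row[i]) - sum(abs(x) for j, x in enumerate(row) if j != i)
               for i, row in enumerate(a))

# Hypothetical matrix: unit diagonal, off-diagonal row sums of 0.6.
F = [[1.0, 0.4, 0.2],
     [0.3, 1.0, 0.3],
     [0.1, 0.5, 1.0]]
N = len(F)
D = max(abs(F[i][i]) for i in range(N))
EPS = row_dominance(F)
NORM_F = inf_norm(F)                    # should equal 2 d - eps
NORM_INV = inf_norm(inverse(F))         # Varah: at most 1 / eps
KAPPA_BOUND = N * (2 * D - EPS) / EPS   # bound on the two-norm condition number
print(NORM_F, NORM_INV, KAPPA_BOUND)
```

The bound is generally not tight; it is useful because it can be set in advance through the dominance constant alone, without computing eigenvalues.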

In the same sense that thin-plate spline interpolation has
a bending model in which an elastic plane is deformed into
contact with the data points, the basis-function interpolation
technique described above has a hammering model in
which a malleable plane is similarly deformed. In the
hammering model the locations of the known and unknown
pixels are marked on a malleable plane surface. Using a
large number of small strikes to smoothly deform the
surface, we direct the hammering at the known pixel
locations. For hammering at each known pixel location
the number of strikes per unit area, which is proportional
to $f_i(x, y)$, depends on the distance from the known pixel
location. The dependence is such that the number of
strikes per unit area at each known pixel location is greater (by
an amount proportional to $\varepsilon$) than the sum of the number of
strikes per unit area at all other known pixel locations (thus
ensuring the diagonal dominance of $F$). Subject to this
dependence, the number of hammering strikes is adjusted
so that each known pixel location on the malleable plane
surface is deformed normal to the plane by an amount
proportional to its gray level.

Several comments on the hammering restoration technique
follow. First, as indicated above, this technique is
applicable when the known pixels are not located on a
regular grid, in which case conventional techniques such as
cubic splines typically cannot be used. Second, although
Gaussian radial basis functions $f(r_i, v_i) = \exp(-r_i^2/2v_i)$ are
consistent with many image formation processes, wavelet
basis functions may also be appropriate. Third, although
well-known techniques such as two-dimensional polynomial
interpolation have been shown to exhibit singularities
for nonuniform sampling, it has been proved that for a
wide class of radial basis functions the matrix $F$ is
nonsingular without the constraints of Eq. (2) and regardless of
the value of $v_i$. However, nonsingularity does not ensure
acceptably small matrix condition numbers: even for provably
nonsingular matrices the level of computational effort
required to obtain basis-function coefficients is typically
unfeasible for sufficiently large matrices. Fourth, the
hammering interpolation technique has a neural-network
interpretation in which the inputs $x$ and $y$ are connected to
neurons that implement the basis functions, and the neuron
outputs pass through weighted connections to an
accumulator that forms the output $z$. Training the neural
network involves finding the connection weights and the
basis-function width parameters, and finding these parameters
requires multivariable nonlinear optimization for
typical basis-function neural networks. However, the technique
described above requires only univariate nonlinear
optimization to obtain the maximum basis-function width
parameters consistent with an acceptable level of computational
effort.

This work was supported in part by the U.S. Air Force
Rome Laboratory, the Miami Valley Research Institute,
and the Martin Marietta Corporation.

Abstract. Generalization is considered in the context of data interpolation,
extrapolation, and approximation. For interpolation using certain functional forms it is
shown that upper bounds may be placed on required numerical precision and number of
computational steps, and it is also shown that lower bounds may be placed on certain
measures of interpolation smoothness and stability. These results are obtained for radial
or other basis function interpolation using a diagonal dominance criterion for the matrix
whose inverse determines the basis function coefficients. The diagonal dominance
criterion is particularly appropriate for applications where extrapolated values must
asymptotically approach an a priori function, and this criterion also provides a
justifiable solution to the problem of selecting basis function parameters.

Generalization realized as data interpolation, extrapolation, or approximation may
be performed with upper bounds on computational effort and lower bounds on
smoothness and stability. Such generalization can be carried out for certain definitions of
the bounded quantities and, in particular, for interpolation using certain functional forms,
including forms that asymptotically approach an a priori function, as required for
applications such as image reconstruction.

As an example, consider data that consists of $m$ input-output points $(x_i, y_i)$, where
the inputs $x_i$ are length $n$ vectors, the outputs $y_i$ are scalars, and $i, j = 1, 2, \ldots, m$. These
data points are to be interpolated and extrapolated using the function
$f(x) = \sum_j c_j \exp[-(x - x_j)^2/2\sigma_j^2]$ obtained by convolving $c_j \delta(x - x_j)$, which is an impulse
function centered on the $j$th data input vector, with $\exp(-x^2/2\sigma_j^2)$, which is a Gaussian
radial basis function with standard deviation $\sigma_j$, and summing the results for all $j$. For
interpolation the coefficients $c_j$ must be such that $y_i = f(x_i)$, and thus they are obtained by
solving $m$ simultaneous linear equations in $m$ unknowns $y_i = \sum_j c_j a_{ij}$, where $a_{ij} =
\exp[-(x_i - x_j)^2/2\sigma_j^2]$.

The parameters $\sigma_j$ may be selected such that the matrix $A = \{a_{ij}\}$ is diagonally
dominant by a positive amount $\varepsilon$ for each column, i.e., so that $\varepsilon = a_{jj} - \sum_{i \neq j} a_{ij}$ for all $j$.
Using standard matrix norm notation, it is well known that the $\infty$-norm of $A$ is $\|A\|_\infty =
\max_i \sum_j |a_{ij}|$ and that for any $n \times n$ matrix $B$ the $\infty$-norm and the 2-norm are related by
$\|B\|_2 \le \sqrt{n}\, \|B\|_\infty$. Also, it has been shown that the $\infty$-norm of $A^{-1}$ is $\|A^{-1}\|_\infty \le 1/\varepsilon$
[Varah, 1975]. Using $a_{jj} = 1$, the above expressions indicate that the 2-norm condition
number of $A$ is $\kappa_2 = \|A\|_2 \|A^{-1}\|_2 \le n(2 - \varepsilon)/\varepsilon$.

The computational effort defined by the numerical precision and the number of
iterative-improvement computational steps required to obtain the $c_j$ has upper bounds
that increase with $\kappa_2$. Suppose that $\kappa_2'$ is the maximum acceptable 2-norm condition
number for $A$. Then smoothness defined by $(\sum_j \sigma_j^2/n)^{1/2}$ is maximized by selecting the
$\sigma_j$ such that $n(2 - \varepsilon)/\varepsilon = \kappa_2'$. This selection requires the solution of $m$ independent
nonlinear equations (one for each unknown $\sigma_j$), but the iterative-improvement solution of
the $m$ simultaneous linear equations $y_i = \sum_j c_j a_{ij}$ asymptotically dominates the required
number of computational steps as $m$ becomes large. Finally, stability defined by $1/r_c$ is
bounded by $1/r_c \ge [(\kappa_2' r_y)^{-1} - 1]/2$ for $\kappa_2' r_y < 1$, where
$r_c = [\sum_i (c_i' - c_i)^2 / \sum_i c_i^2]^{1/2}$ is the
fractional root-mean-square coefficient change,
$r_y = [\sum_i (y_i' - y_i)^2 / \sum_i y_i^2]^{1/2}$ is the
fractional root-mean-square data output change, and the $c_i$ change to $c_i'$ if the $y_i$ change
to $y_i'$ [Golub and Van Loan, 1989].
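
The stability bound can be illustrated on a very small system. The sketch below is an assumption-laden example rather than anything taken from the paper: it uses a hypothetical 2 x 2 matrix with unit diagonal and diagonal dominance 0.5, takes n in the condition-number bound to be the matrix dimension, perturbs one data output by one percent, and compares the observed stability 1/r_c with the lower bound quoted above.

```python
import math

def solve2(a, y):
    """Solve a 2 x 2 linear system a c = y by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(y[0] * a[1][1] - a[0][1] * y[1]) / det,
            (a[0][0] * y[1] - y[0] * a[1][0]) / det]

def frac_rms_change(old, new):
    """Fractional root-mean-square change from the vector old to the vector new."""
    num = sum((p - q) ** 2 for p, q in zip(new, old))
    den = sum(q ** 2 for q in old)
    return math.sqrt(num / den)

def stability_lower_bound(kappa, r_y):
    """Lower bound [(kappa r_y)^-1 - 1] / 2 on 1/r_c, valid for kappa r_y < 1."""
    if kappa * r_y >= 1.0:
        raise ValueError("bound requires kappa * r_y < 1")
    return (1.0 / (kappa * r_y) - 1.0) / 2.0

# Hypothetical example: unit diagonal, diagonal dominance eps = 0.5.
A = [[1.0, 0.5], [0.5, 1.0]]
EPS, N = 0.5, 2
KAPPA_MAX = N * (2.0 - EPS) / EPS        # condition-number bound n(2 - eps)/eps

Y, Y_NEW = [1.0, 1.0], [1.01, 1.0]       # one data output changes by 1 percent
C, C_NEW = solve2(A, Y), solve2(A, Y_NEW)
R_Y = frac_rms_change(Y, Y_NEW)
R_C = frac_rms_change(C, C_NEW)
STABILITY = 1.0 / R_C
BOUND = stability_lower_bound(KAPPA_MAX, R_Y)
print(STABILITY, BOUND)
```

In this example the observed stability is well above the bound, which is expected because the bound is computed from the worst-case condition number rather than the actual one.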

For this example, generalization is thus achieved with least upper bounds on
computational effort and greatest lower bounds on smoothness and stability if the $\sigma_j$ are
selected such that $A$ has diagonal dominance $\varepsilon = 2n/(n + \kappa_2')$. This example may be
modified to address approximation, multiple outputs, norms other than the 2-norm, other
definitions of computational effort, smoothness, and stability, and non-Gaussian or non-radial
basis functions provided that such functions decrease as any of their independent
variables increase.
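
The diagonal dominance value quoted above follows from setting the condition-number bound equal to the largest acceptable condition number and solving for the dominance. The short derivation below only rearranges expressions already given in the text; the numerical values in the last line are hypothetical and serve only as an illustration.

```latex
\begin{align*}
\frac{n(2-\varepsilon)}{\varepsilon} &= \kappa_2' \\
2n - n\varepsilon &= \kappa_2'\,\varepsilon \\
\varepsilon &= \frac{2n}{n + \kappa_2'} \\
\text{e.g., } n = 4,\ \kappa_2' = 100:\quad
\varepsilon &= \frac{8}{104} \approx 0.077
\end{align*}
```

A larger acceptable condition number therefore permits a smaller dominance and hence wider basis functions.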

In the same sense that thin plate spline interpolation has a "bending" model in
which an elastic plane is deformed into contact with the data points, the above example
has a "hammering" model in which a malleable plane is similarly deformed. In the
hammering model numerous small strikes are directed at each $x_j$ with Gaussian precision
$\exp[-(x - x_j)^2/2\sigma_j^2]$ such that the plane is smoothly deformed into contact with the data
points. The hammering standard deviations $\sigma_j$ are selected such that the ratio of the
strike density at $x_j$ to the sum of strike densities at the $x_i$ with $i \neq j$ (i.e., the ratio of hits to misses) is
at least $(n + \kappa_2')/(2n)$. Here $\kappa_2'$ bounds (1) the numerical precision and number of
iterative-improvement computational steps required to determine the relative number of
strikes at each $x_j$, (2) the stability with respect to change in $y_j$ of this determination as
measured by $1/r_c$, and (3) the smoothness of the hammered surface as measured by the
root-mean-square of the $\sigma_j$.

The diagonal dominance criterion permits the acceptance of larger condition
numbers for $A$ than criteria [Narcowich and Ward, 1991] that apply to other classes of
basis functions, e.g., radial basis functions that increase with radius. By some criteria
such radial functions interpolate more smoothly than basis functions that decrease with
radius, but the latter functions are appropriate for problems in which extrapolated values
must approach an a priori function with increasing distance from the inputs, as may be required
for image reconstruction applications [Gustafson et al., 1992]. Also, in applications for
which the process that produced the data is unknown, generalization that smoothes the
data by convolution with a Gaussian or similar decreasing-with-radius function may be
justified. Finally, the diagonal dominance criterion provides a justifiable solution to the
problem of basis function parameter selection [Poggio and Girosi, 1990], i.e., width
parameters such as $\sigma_j$ are maximized such that the largest acceptable condition number is
not exceeded.

1. G. H. Golub and C. F. Van Loan, Matrix Computations, 2nd ed., Johns Hopkins
Univ. Press, 1989.

2. S. C. Gustafson, G. R. Little, J. S. Loomis, and T. S. Puterbaugh, "Optimal
Reconstruction of Missing Pixel Images," submitted to Applied Optics, Jan. 1992.

3. F. J. Narcowich and J. D. Ward, "Norms of Inverses and Condition Numbers for
Matrices Associated with Scattered Data," J. Approximation Theory, Vol. 64, pp.
69-94, Jan. 1991.

4. T. Poggio and F. Girosi, "Networks for Approximation and Learning," Proc.
IEEE, Vol. 78, pp. 1481-1497, Sep. 1990.

5. J. M. Varah, "A Lower Bound for the Smallest Singular Value of a Matrix,"
Linear Algebra and Its Applications, Vol. 11, pp. 3-5, Jan. 1975.


SPIE  Critical  Review  CR-40-02,  San  Jose,  CA,  4  Nov  91 


ADAPTIVE  OPTICAL  CORRELATION  USING 
NEURAL  NETWORK  APPROACHES 

David L. Flannery and Steven C. Gustafson
Research Institute, University of Dayton
Dayton, Ohio 45469-0140

Abstract

This paper reviews work on binary phase-only (BPOF) and ternary phase-amplitude
(TPAF) correlation and highlights recent investigations of neural network approaches
for augmenting correlation-based hybrid (optical/electronic) automatic target
recognition systems. The theory and implementation of BPOF and TPAF correlation
using available spatial light modulators is reviewed, including recent advances in smart
TPAF formulations. Results showing the promise of neural networks for enhancing
correlation system operation in the areas of estimating distortion parameters, adapting
filters, and improving discrimination are presented and discussed.

1.  INTRODUCTION 

Coherent optical correlation and artificial neural network approaches have
independently shown promise for pattern recognition, including automatic target
recognition (ATR), which is an important military application area. This paper reviews
the concepts and progress of investigations that combine these two approaches and that
are motivated by the potential for retaining the strengths while bypassing the
weaknesses of each approach.

A leading candidate architecture for ATR using optical correlation is the hybrid
adaptive correlator (HAC) concept depicted in Figure 1. The HAC consists of a rapid
sequential (i.e., "real-time") correlation module embedded in an overall electronic
system that controls the cycle of operation, including the selection of appropriate
correlation filters from a large bank of pre-computed "smart" (i.e., combining both
distortion-invariance and clutter discrimination) filters.

For any practical application scenario the number of filters is large, e.g., 1,000 or
more. Thus the problem of selecting the best filter subset from the bank at a particular
time is critical. This problem is not yet resolved even though current and projected
device technology supports correlation rates of 100 to 1000 input-filter pairs/sec. The
"filter strategy" control problem limits practical designs because of the implied
workload for the electronic control system.

A salient advantage of the correlation approach is its inherent shift-invariant
response: the target need not be centered in the correlator input, and the location of the
correlation peak in the output plane provides a location estimate for the target in the
input plane.

Neural network approaches clearly have the capability to perform the ATR
function, but their practical application is complexity-limited. From an information
theory perspective, the ATR problem can be viewed as a very complex input-output
relationship that must be learned with sufficient accuracy to provide statistically
acceptable target-nontarget discrimination. Statistical performance is emphasized here
because of the robust nature of realistic ATR scenes; a training set can at best be only
statistically representative of the world of all possible input scenes. So far this problem
description sounds well suited for neural network approaches and, in principle, these
approaches are ideal. However, ATR performance must be invariant to target shifts,
and when the additional complexity associated with this invariance is added to that
associated with robust sets of input scenes, the resulting overall complexity exceeds
levels that can be practically handled with current or near-term computational
resources, i.e., neural network training time is unacceptably long, and the size of the
network is too large either for simulation or for practical embodiment in a real-time
system.
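
The shift-invariant response of correlation noted above, which a neural network would otherwise have to learn, can be illustrated with a one-dimensional digital analogue. The sketch below is not a model of the optical correlator; the template values, scene length, and function names are hypothetical. It places the same target at several positions in an otherwise empty scene and shows that the correlation peak moves with the target.

```python
def circular_correlate(scene, template):
    """Circular cross-correlation: out[s] = sum_k scene[(k + s) % n] * template[k]."""
    n = len(scene)
    return [sum(scene[(k + s) % n] * template[k] for k in range(len(template)))
            for s in range(n)]

def peak_location(values):
    """Index of the largest correlation value."""
    return max(range(len(values)), key=lambda i: values[i])

TEMPLATE = [1, 3, 2]   # hypothetical one-dimensional "target"

def scene_with_target(shift, n=16):
    """Empty scene of length n with the target placed at the given shift."""
    scene = [0] * n
    for k, t in enumerate(TEMPLATE):
        scene[(shift + k) % n] = t
    return scene

PEAKS = [peak_location(circular_correlate(scene_with_target(s), TEMPLATE))
         for s in (2, 7, 11)]
print(PEAKS)  # prints [2, 7, 11]: the peak location tracks the target location
```

The same filter serves every target position, which is why the correlator needs no additional complexity to handle target shifts.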

This paper concentrates on neural network approaches that augment the HAC
concept. Three areas are considered:

1. Estimating target distortions relative to reference views.

2. Synthesizing new filters to follow dynamic target distortions.

3. Estimating confidence levels associated with correlation peaks to improve
target-nontarget discrimination.

Work performed in all three areas is reviewed in the context of BPOF and TPAF
correlation within the HAC concept. Two types of neural networks have been used in
this work, "standard" backpropagation and "new" locally linear and stretch and hammer
architectures developed at the University of Dayton. Background on BPOF and TPAF
correlation is presented below and is followed by a review of pertinent neural network
theory. Investigations in the three areas listed above are then discussed in turn.

2.  CORRELATION  WITH  BPOF  AND  TPAF  FILTERS 

A recent review of optical correlation techniques was provided by Flannery and
Horner [1989 (a)]. A renewed surge of interest in optical correlation for practical
applications has been spurred by the recent development of the phase-only filter (POF)
concept [Horner, 1984], rapidly followed by the development of discrete-modulation-level
BPOF and TPAF filters that support effective real-time implementation with
currently available spatial light modulators (SLM) [Ross, 1983; Psaltis, 1984; Flannery,
1986]. These developments have made the HAC (Figure 1) a practical concept capable
of implementation using current technology.

The BPOF is defined with two phase modulation levels (usually 0 and 180 degrees,
corresponding to amplitudes of 1 and -1). It can be implemented with magneto-optic
SLMs (MOSLM) [Ross, 1983; Davis, 1989], ferroelectric liquid crystal (FLC) SLMs
[Johnson, 1990], and deformable mirror device (DMD) SLMs [Florence, 1990].

The TPAF may be viewed as an important extension of the BPOF that includes one
additional modulation level: zero (i.e., the signal light is blocked at the filter positions
or pixels having this level). In practice even the BPOF has zero-modulation levels due
to its limited region of support (e.g., finite SLM aperture). In the TPAF, elements are
set to zero anywhere in the filter region according to an algorithm that provides
(primarily) the benefit of improved target-nontarget discrimination. This improvement
is achieved by blocking spatial frequencies where nontargets are expected to have
relatively large spectral content [Flannery, 1988 (a)].
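
The relationship between the two filter types can be sketched digitally in one dimension. The example below is an illustration under explicit assumptions, not the formulation algorithm of the cited work: the BPOF is taken as the sign of the real part of the reference spectrum (one common binarization), and the TPAF zero state is assigned wherever the nontarget spectral magnitude exceeds the target spectral magnitude, which is a simple stand-in for "relatively large spectral content". The signals, the `ratio` parameter, and all function names are hypothetical.

```python
import cmath

def dft(signal):
    """Naive discrete Fourier transform of a short real or complex sequence."""
    n = len(signal)
    return [sum(signal[k] * cmath.exp(-2j * cmath.pi * u * k / n)
                for k in range(n))
            for u in range(n)]

def bpof(reference):
    """Binary phase-only filter values (+1 or -1) from the sign of the real
    part of the reference spectrum (an assumed binarization rule)."""
    return [1 if h.real >= 0 else -1 for h in dft(reference)]

def tpaf(reference, nontarget, ratio=1.0):
    """Ternary filter: BPOF values, with 0 where the nontarget spectrum is
    larger than ratio times the target spectrum (an assumed blocking rule)."""
    target_mag = [abs(h) for h in dft(reference)]
    clutter_mag = [abs(h) for h in dft(nontarget)]
    return [0 if c > ratio * t else b
            for b, t, c in zip(bpof(reference), target_mag, clutter_mag)]

# Hypothetical signals: a short target and a periodic nontarget pattern whose
# spectrum is concentrated at two spatial frequencies (indices 0 and 4).
TARGET = [0, 1, 1, 0, 0, 0, 0, 0]
CLUTTER = [1, 0, 1, 0, 1, 0, 1, 0]
BPOF = bpof(TARGET)
TPAF = tpaf(TARGET, CLUTTER)
print(BPOF)
print(TPAF)
```

In this example only the two frequencies at which the periodic pattern concentrates its energy are blocked, while the remaining filter elements keep their binary phase values.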

The TPAF may be implemented in MOSLM devices using appropriate drive
techniques to access a third "mixed-magnetization" state [Kast, 19^]. Theoretical and
experimental results amply demonstrate the improved discrimination (relative to BPOF
correlation) provided by the TPAF [Lindell, 1990; Flannery, 1990; Flannery, 1991].

Recent reports indicate good potential for implementing TPAF modulation in other
SLM devices [Juday, 1991]. Another approach is to cascade a binary amplitude SLM
with a phase-modulating SLM, which is less attractive from an optical engineering
perspective.

All the limited-modulation filters discussed above (POF, BPOF, and TPAF)
provide, in general, excellent correlation performance as characterized by sharper
correlation peaks, greater peak intensity (correlation efficiency), and improved
nontarget discrimination compared to the classic matched filter (which uses full
complex amplitude modulation).

The signal-to-noise performance of limited-modulation filters has been a subject of
great interest since intuition suggests that a price must be paid for restricting
modulation levels. The classic matched filter by definition provides the best SNR
(signal-to-noise ratio) for the case of additive Gaussian noise. However, the
limited-modulation filters have shown generally superior discrimination against actual scene
clutter [Horner, 1990]. Analysis is complicated by the lack of accepted standardized
analytical models of practical clutter. Thus SNR or discrimination performance is
scene-dependent, and many results are rather anecdotal or suspect because of the
limited robustness of test sets. Several theoretical treatments (limited by the
assumption of Gaussian white noise) have been reported [Dickey, 1988, 1989;
Kumar, 1989] but are not reviewed here. However, limited-modulation-level filters
exhibit typical SNR reductions of 3 to 10 dB (decibels) relative to matched filters for
the white noise case but frequently provide better discrimination against practical noise
patterns. Worth noting is a recent analytical treatment that derives tight bounds for
SNR degradations for various limited-modulation filters [Farn, 1990]. A summary of
the current situation is that the presumed SNR penalties for BPOF and TPAF
correlation are far outweighed by their advantages, which include the overwhelming
advantage of practical implementation with available SLM devices. An additional
practical advantage is the reduced amount of storage required for BPOF and TPAF
filters (e.g., one or two binary bits per pixel) relative to complex-valued filters (e.g., at
least 8 bits per pixel).


Effective techniques for formulating smart BPOF and TPAF filters have been
developed and demonstrated in both simulations and experiments. A major design
component of these filters is the specification of the zero-state pattern that optimizes
SNR and/or other metrics, including peak intensity (or correlation efficiency)
[Flannery, 1990, 1991; Kumar, 1991]. Distortion invariance is addressed by adjusting the
training-set weights of a composite in-class (target) image [teed, 1989(a); Flannery
1990, 1991] or by direct iteration of the phase values of a POF pattern [Kallman, 1987].
The threshold line angle (TLA) [Flannery, 1988 (b); Farn, 1988] is another design
element of BPOF and TPAF formulations. It determines the relative weighting of odd
and even symmetry components of the target image in the filter response. Small but
significant improvements result from choosing an optimum TLA value.

A very recent and important development in TPAF optimization is formulation
algorithms that address mixed-metric optimizations [Kumar, 1991; Flannery, 1991],
i.e., compromises that simultaneously address two or more correlation performance
metrics. These algorithms are motivated by the observation that filters optimized with
regard only to SNR frequently have undesirable performance characteristics in other
respects, including correlation efficiency so low as to preclude practical implementation
and broad correlation peaks that hamper location-estimate post-processing. These
undesirable characteristics are associated with very small support regions that result
from optimizing only for SNR, i.e., most of the filter area is set to the zero-modulation
state.

Results are presented here from a recent case study of TPAF smart filter
formulation using mixed metrics [Flannery, 1991] to illustrate the state of the art in this
area. Figure 2 provides an example input scene of a tank on a bushy background and
the binary version of the scene as used in the study (to match the binary input capability
of the experimental correlator that uses MOSLM devices at both input and filter
planes). Smart TPAF filters were formulated to cover 20 degrees of in-plane target
rotation and 12% of in-plane target scale variation while rejecting the background
shown as well as three other backgrounds of diverse character. When the filter was
optimized only for SNR (i.e., background discrimination) it provided excellent
discrimination (averaging 8.73 dB over the four backgrounds), but the correlation
efficiency was so low that practical implementation in the experimental correlator was
marginal. With optimization to a mixed metric reflecting equal emphasis on both SNR
and peak intensity, the filter still provided excellent background discrimination
(averaging 7.62 dB) and also provided over twice the peak intensity of the previous
filter, which was a practical level for the experimental correlator. Information
generated during filter design indicated that in this case the compromise filter provided
83% of the SNR of the best-SNR design and 83% of the peak intensity of the filter
designed for maximum peak response. Figure 3 shows simulated and experimental
correlation intensity plots for the compromise filter with the input of Figure 2(b).
3.  NEURAL  NETWORK  ARCHITECTURES

This section reviews neural network architectures and training algorithms used in
the work discussed here. Backpropagation (e.g., Lippmann, 1987), locally linear
[Gustafson, 1991b], and stretch and hammer [Gustafson, 1991a] neural networks are
considered.

Backpropagation-trained neural network:

Feedforward neural networks with weights determined by backpropagation
training are the most commonly used neural networks in engineering applications. It
has been shown in principle that with three neuron layers and with enough neurons in
the hidden layer (the layer not directly connected to either inputs or outputs) any
arbitrary input-output relationship may be learned. The backpropagation algorithm is
equivalent to a gradient descent optimization in weight space with a least-square-error
goal. However, no general prescription for the various design elements is available,
and judicious initial choices and trial-and-error design iterations are the normal methods
for designing networks for particular applications. Accuracy of learning, interpolation,
and extrapolation relative to training set data are important performance parameters.
These parameters are affected by training set selection, training schedule, RMS error
goal, and network topography. A large body of common experience relating these
elements has evolved, and the results reported here were obtained using conventional
methods based on this experience.
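
The gradient-descent view of backpropagation can be illustrated with a minimal sketch. It is not the network used in the work reviewed here: the XOR training set, the three hidden neurons, the learning rate, and all function names are assumptions chosen only to keep the example small. The code performs batch gradient descent in weight space on a squared-error goal for a three-layer (input, hidden, output) sigmoid network.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, params):
    """Return (hidden activations, output) of a three-layer network."""
    w_hid, b_hid, w_out, b_out = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(w_hid, b_hid)]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, output

def mean_squared_error(data, params):
    return sum((forward(x, params)[1] - t) ** 2 for x, t in data) / len(data)

def train(data, n_hidden=3, rate=0.5, epochs=2000, seed=0):
    """Batch gradient descent on the squared error (illustrative settings)."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w_hid = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b_hid = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b_out = rng.uniform(-1, 1)
    initial_error = mean_squared_error(data, (w_hid, b_hid, w_out, b_out))
    for _ in range(epochs):
        grad_w_hid = [[0.0] * n_in for _ in range(n_hidden)]
        grad_b_hid = [0.0] * n_hidden
        grad_w_out = [0.0] * n_hidden
        grad_b_out = 0.0
        for x, target in data:
            hidden, output = forward(x, (w_hid, b_hid, w_out, b_out))
            # Error signal at the output neuron (derivative of squared error).
            delta_out = (output - target) * output * (1.0 - output)
            grad_b_out += delta_out
            for j in range(n_hidden):
                # Error signal propagated back to hidden neuron j.
                delta_hid = delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                grad_w_out[j] += delta_out * hidden[j]
                grad_b_hid[j] += delta_hid
                for i in range(n_in):
                    grad_w_hid[j][i] += delta_hid * x[i]
        # Step downhill in weight space.
        b_out -= rate * grad_b_out
        for j in range(n_hidden):
            w_out[j] -= rate * grad_w_out[j]
            b_hid[j] -= rate * grad_b_hid[j]
            for i in range(n_in):
                w_hid[j][i] -= rate * grad_w_hid[j][i]
    params = (w_hid, b_hid, w_out, b_out)
    return params, initial_error, mean_squared_error(data, params)

XOR_DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
            ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
```

Calling `train(XOR_DATA)` returns the trained weights together with the error before and after training. How far the error falls depends on the seed, learning rate, and number of epochs, which is the trial-and-error aspect noted above.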

Locally  linear  neural  network: 

As noted above, multilayer feedforward neural networks that use backpropagation
or similar training algorithms are by far the most common in engineering applications.
However, these neural networks have several limitations. First, they generally lack the
coordinate invariance property, according to which the testing output is unchanged if
the testing and training (or data) inputs are translated, rotated, or scaled. Second,
without excessive training they generally lack the data interpolation property, according
to which the testing output is the training output if the testing inputs are the training
inputs [Poggio and Girosi, 1990]. Third, they generally lack the linear representation
property, according to which any testing point is on a linear surface if all training points
are on this surface. Finally, they generally lack the data bootstrapping property,
according to which the testing outputs have least squared error for any training points
that are transformed into testing points. In contrast, locally linear neural networks have
all of the above desirable characteristics. There are two steps in training: transforming
the inputs to invariant coordinates and finding a plane through each training point that
satisfies a bootstrapping property in that the plane predicts, with minimum squared
error, a specified number of training point nearest neighbors. In testing the plane
through the training point nearest to the testing point (in the transformed inputs) is used
to find the testing point output. Locally linear neural networks extend nearest neighbor
techniques because in training they find a plane (having as many dimensions as there
are inputs) through each training point that in general has non-zero slope.
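
A one-input sketch may help make the two-step procedure concrete. It is an illustration under stated assumptions, not the implementation from the cited work: with a single input the "plane" through each training point reduces to a line, the invariant-coordinate transformation is omitted, and the function names and the default of two nearest neighbors are hypothetical.

```python
def train_locally_linear(xs, ys, n_neighbors=2):
    """For each training point, find the slope of the line through that
    point that predicts its nearest neighbors with least squared error."""
    slopes = []
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        neighbors = sorted((j for j in range(len(xs)) if j != i),
                           key=lambda j: abs(xs[j] - xi))[:n_neighbors]
        num = sum((xs[j] - xi) * (ys[j] - yi) for j in neighbors)
        den = sum((xs[j] - xi) ** 2 for j in neighbors)
        slopes.append(num / den if den else 0.0)
    return slopes

def predict(x, xs, ys, slopes):
    """Use the line through the training point nearest to the testing point."""
    i = min(range(len(xs)), key=lambda j: abs(xs[j] - x))
    return ys[i] + slopes[i] * (x - xs[i])
```

For training points that all lie on a line, e.g. `xs = [0, 1, 2, 4]` and `ys = [1, 3, 5, 9]`, every fitted slope equals the slope of that line, so any testing point also falls on it (the linear representation property), and a testing input equal to a training input returns the training output (the data interpolation property).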
Stretch  and  hammer  neural  network:

Stretch and hammer neural networks also avoid the many limitations of
backpropagation training and use radial basis function methods to achieve advantages
in generalizing training examples. These advantages include (1) exact learning, (2)
maximally smooth modeling of Gaussian deviations from linear relationships, (3)
identical outputs for arbitrary linear combination of inputs, and (4) training without
adjustable parameters in a predeterminable number of steps. Stretch and hammer
neural networks are feedforward architectures that have separate hidden neuron layers
for stretching and hammering in accordance with an easily visualized physical model.
Training consists of (1) transforming the inputs to principal component coordinates, (2)
finding the least squares hyperplane through the training points, (3) finding the
Gaussian radial basis function variances at the column diagonal dominance limit, and
(4) finding the Gaussian radial basis function coefficients. The Gaussian radial basis
function variances are chosen to be as large as possible consistent with maintaining
diagonal dominance for the simultaneous linear equations that must be solved to obtain
the basis function coefficients. This choice ensures that training example generalization
is maximally smooth consistent with unique training in a predeterminable number of
steps.

4.  DISTORTION  PARAMETER  ESTIMATION  USING  NEURAL  NETWORKS 

The problem addressed by the work reported in this section is as follows. Given a
starting condition consisting of correlation of an input target object with a matching
filter, estimate distortion parameter(s) (e.g., rotation angle and scale factor) as the input
object is distorted from the original view. The value of distortion parameter estimates
for use in a filter control strategy for a HAC is obvious.

The approach investigated for distortion parameter estimation in one study
[Gustafson, 1990] involved sampling the shape of the correlation peak (the desired peak
in response to the target) to derive inputs for a backpropagation-trained neural network.
In-plane target rotation was used as the distortion mechanism and thus the goal of the
network was to estimate the rotation angle relative to the initial view. A binary image
of an aircraft was used as the target, and 128x128 sample correlation simulations were
performed. Intensity samples were taken on a 5x5 pixel grid centered on the correlation
peak. These samples were normalized to the central value, thus defining 24 inputs to
the neural network. The network was trained for 5-degree intervals of target rotation
with the filter held constant at the initial view. The neural network estimated in-plane
rotation to within ±[?]5 degrees over a range of -40 to +40 degrees from the initial view.
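
The correlation-peak sampling step can be sketched as follows. This is an illustrative reading of the description above, not the original code: the function name, the representation of the correlation plane as a list of rows, and the wrap-around at the plane edges are assumptions.

```python
def peak_shape_inputs(plane, half_width=2):
    """Sample a (2*half_width+1)-square grid centered on the highest
    correlation peak and normalize to the central value. The central
    sample itself (always 1 after normalization) is dropped, so the
    default 5x5 grid yields 24 network inputs."""
    rows, cols = len(plane), len(plane[0])
    peak_row, peak_col = max(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: plane[rc[0]][rc[1]])
    center = plane[peak_row][peak_col]
    inputs = []
    for d_row in range(-half_width, half_width + 1):
        for d_col in range(-half_width, half_width + 1):
            if d_row == 0 and d_col == 0:
                continue
            # Assumed: wrap at the edges, as in a DFT-based simulation.
            r = (peak_row + d_row) % rows
            c = (peak_col + d_col) % cols
            inputs.append(plane[r][c] / center)
    return inputs
```

Normalizing to the central value removes the overall peak intensity, so the network sees only the peak shape.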

Another study [Gustafson, 1991] applied the same approach to an IR truck target
image and used both backpropagation and locally linear neural networks. The
backpropagation network estimated rotation angle with errors less than 2 degrees,
whereas the LLN provided errors less than 0.35 degrees (see Figures 4 and 5).
Estimation was seriously degraded when realistic backgrounds were introduced in the
input scenes, presumably due to the resulting distortion of the correlation peak relative
to the zero-background case. Training using background-distorted inputs did not reduce
this degradation to an acceptable level.

To reduce the degradation caused by backgrounds, a new sampling approach,
input-plane sampling, was introduced. In this approach (successful) correlation is used
to establish a reference point (location estimate) for sampling the target intensity in the
input plane. Sampling was again based on a 5x5-pixel grid, but in general the sampling
pattern was chosen to keep most or all samples on the target image (thus excluding
background effects). A study using the same truck target showed that rotation angle
could be predicted over the full rotation range of 360 degrees with errors well below
two degrees even in the presence of clutter backgrounds (see Figure 6) [Olczak, 1991].

Although  perhaps  not  obvious,  a  boot-strapping  assumption  is  implicit  in  the  above 
distortion-estimation  approaches:  good  correlation  must  be  maintained  at  all  times. 

The correctly-located correlation peak must be detectable (i.e., must exhibit a
sufficiently high peak-to-clutter ratio in the correlation plane) to enable proper location
of the sampling grid for either approach discussed above. The input-plane sampling
approach is generally superior because it provides samples that are less corrupted in
cases where there is a slight mismatch between filter and target but where good
correlation is still obtained.

Distortion parameter estimation using neural network approaches shows promise
for aiding HAC filter control strategy in tracking modes of operation (where successful
correlation has been achieved and must be maintained through rapid target evolution).
Current research is investigating issues such as sensitivity to sampling grid location
accuracy, optimization of sampling grid patterns, and extension to multiple distortion
dimensions.

5.  FILTER  SYNTHESIS  USING  NEURAL  NETWORK  APPROACHES 

This section reviews efforts that investigate the modification or synthesis of TPAF
filters using neural network approaches.

Filter amplitude states:

In an initial simulation study the bandpass of a BPOF was covered by four binary
(on-off) amplitude control rings driven by four neural network outputs [Flannery, 1989].
The inputs were the integrated power spectral densities in the input scene taken
over the same four spectral rings (readily available on a real-time basis in an optical
correlator system). The goal was to maximize correlation signal-to-noise for a target
embedded in different noise samples by optimum control of the correlator bandpass.

This  control  may  be  viewed  as  setting  zero-states  in  a  TPAF  on  a  very  coarse 
framework.  The  simulations  were  successful  in  that  the  neural  network  was  easily 
trained  to  provide  near-optimum  bandpass  configurations  for  a  variety  of  input  noise 
conditions. 
Filter phase states:


Two basic issues must be resolved to define a general filter synthesis approach:

(1) A filter representation or parameterization must be used that is consistent with the
number of available outputs in current neural network practice (which is limited in both
simulations and hardware implementations) and (2) the inputs to the network must be
specified.

To illustrate the first issue, consider the complete specification of a 128x128-pixel
TPAF. In excess of 32,000 binary values must be determined, whereas the neural
network computation resources typically used support at most 600 outputs. Thus ways
to reduce the number of parameters used to represent a filter must be developed. One
approach is to impose a bandpass limit on the filter (one quarter of the full Nyquist
bandwidth) and to consider only BPOF filters with cosine (even) symmetry. Another
approach is to extend the bandpass to almost the full Nyquist interval by grouping
pixels in 3x3 super-pixels (called "nonapixels" hereafter), each controlled by a single
modulation value. The first approach severely smooths the filter impulse response,
whereas the second approach limits the extent of the impulse response to about one
third of the input field of view.
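
The parameter counts behind this argument can be checked with a few lines of arithmetic. The sketch rests on assumptions not spelled out in the text: that a TPAF pixel needs two binary values (one for phase, one for zero or non-zero amplitude), that a quarter-Nyquist bandpass corresponds to a central 32x32 block of the 128x128 filter, and that even symmetry halves the number of independent values.

```python
FILTER_SIZE = 128     # pixels per side of the filter SLM
MAX_OUTPUTS = 600     # network outputs stated as typically supported

# Assumed: two binary values per TPAF pixel (phase bit and amplitude bit).
full_tpaf_values = FILTER_SIZE * FILTER_SIZE * 2

# Assumed: quarter-Nyquist bandpass keeps a central block one quarter as wide.
band_side = FILTER_SIZE // 4
# Assumed: cosine (even) symmetry leaves half of the block independent.
quarter_band_bpof_values = band_side * band_side // 2

print(full_tpaf_values, quarter_band_bpof_values)
```

Under these assumptions the full TPAF needs 32,768 binary values, consistent with "in excess of 32,000", while the band-limited symmetric BPOF needs 512, which fits within the 600-output limit.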

The second issue, defining network inputs, may be addressed with the same two
sampling approaches already discussed in the context of distortion parameter estimation
(correlation peak shape and input-plane sampling).

Steps taken during the course of investigations on filter synthesis at the University
of Dayton are summarized here. These investigations paralleled the
distortion-estimation investigations in the sense that input-plane sampling was found to be much
superior to sampling the correlation peak. The quarter-Nyquist bandpass filters were
successfully synthesized but were of little practical value due to their limited bandpass;
they exhibited insufficient discrimination against background clutter and non-target
vehicles (see Figure 7). Nonapixel filters were also successfully synthesized and
performed well [Olczak, 1991]. For nonapixel filters the target extent was less than 40
pixels on a 128-pixel format, thus satisfying the constraint mentioned above.

Recent previously unreported results on the synthesis of nonapixel BPOFs using
input-plane sampling are shown in Figures 8, 9 and 10. Figure 8(a) is a typical input
image showing the target truck and another vehicle superimposed on a clutter
background. Figure 8(b) is a nonapixel filter pattern for the truck. Figure 8(c) is a
correlation intensity pattern from a correlation simulation using this input and filter.
Figures 9 and 10 provide plots of filter peak-to-clutter performance over 360 degrees of
target rotation for synthesized nonapixel filters using backpropagation and stretch and
hammer neural networks, respectively. Data for two other filters also are plotted for
reference in Figure 9: the simple (single-view) BPOF and the best possible BPOF that
can be designed at each rotation angle. As is apparent in the plots, the performance of
the neural-synthesized filters approaches that of the best-possible filters to a satisfying
degree. These plots involve the target superimposed on one of several clutter
backgrounds; results with the other backgrounds were similar.
The results seem impressive because the neural network provides filter information
equivalent to about 45 distinct single-view filters, which would require about 3.3 kB
(kilobytes) of digital storage (assuming about 600 binary bits/filter). Neural network
filter synthesis also provides an implicit control mechanism because no indexing or
searching through a filter database is required. The only caveat is the same one
discussed in the review of distortion-estimation techniques, which is that this approach
is based on maintaining good correlation in a dynamic scenario (i.e., successfully
tracking the target) so that the location estimate derived from the correlation peak may
be used to accurately center the input-plane sampling grid.
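
As a rough check on the storage figure, the arithmetic can be written out, assuming about 45 stored single-view filters of about 600 binary values each and 1 kB taken as 1024 bytes (both readings of the text are assumptions):

```latex
\[
45 \times 600\ \text{bits} = 27{,}000\ \text{bits}
  = \frac{27{,}000}{8}\ \text{bytes} = 3{,}375\ \text{bytes}
  \approx 3.3\ \text{kB}.
\]
```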

An important question concerns the effect of target distortions not learned by the
network, e.g., scale changes. More generally, how can neural network filter synthesis
be extended to distortions involving two or more degrees of freedom, e.g., azimuth,
elevation, scale, etc.? The associated complexity growth may tend to drive the problem
outside bounds practically addressable by available neural network resources. This and
other issues similar to those mentioned for distortion estimation are under investigation,
as are techniques for neural network synthesis of the zero-modulation pattern required
to form a full TPAF.

6.  CORRELATION  CONFIDENCE-LEVEL  ESTIMATION  USING  NEURAL  NETWORK  APPROACHES

A correlation involving a target surrounded by clutter will result in a peak
corresponding to the target and other (hopefully smaller) peaks in response to clutter.
Normally the desired peak must exceed other peaks by some margin (e.g., 3 dB) for
correlation to be useful. If the clutter level in the input scene is gradually increased, a
point is reached where the filter is no longer useful by this standard. If the filter is a
distortion-invariant smart filter, it might be possible to substitute a more target-specific
filter which would furnish better discrimination. This approach is undesirable in
general because it implies a large storage bank of "more-specific" filters and because
there is an implied control problem (i.e., which of the many more-specific filters
corresponding to a single smart filter should be substituted?). Neural network
techniques that estimate confidence levels for correlation peaks are potential
approaches to this problem. Note that if the original filter was already of the
more-specific variety, some augmentation of the correlation process is necessary if useful
results are to be obtained.
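
The usefulness criterion described above can be expressed as a small check. This is a sketch under the assumption that peak values are correlation intensities, so the margin is 10 log10 of the intensity ratio; the function names are hypothetical and the 3 dB default is the example margin quoted in the text.

```python
import math

def peak_margin_db(target_peak, other_peaks):
    """Margin of the desired peak over the highest competing peak, in dB
    (assumes all values are positive correlation intensities)."""
    return 10.0 * math.log10(target_peak / max(other_peaks))

def correlation_is_useful(target_peak, other_peaks, margin_db=3.0):
    """True if the desired peak exceeds every other peak by the margin."""
    return peak_margin_db(target_peak, other_peaks) >= margin_db
```

For example, a target peak of twice the intensity of the strongest clutter peak gives a margin of about 3.01 dB and passes the 3 dB criterion, whereas a ratio of 1.5 (about 1.76 dB) does not.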

The same two sampling techniques, correlation peak and input-plane, were
investigated as neural network inputs, and again input-plane sampling proved superior.
Recent previously unreported results are synopsized here to illustrate the potential of
this approach.

A set of terrain board images was provided by Martin Marietta Strategic Systems,
Denver, Colorado. These 128x128-pixel images included three vehicle targets on a
cluttered background. The image set spanned elevations of 15 to 45 degrees and
azimuths of 0 to 90 degrees. A matrix of 45 images covering this two-dimensional
distortion range was used for this study, including 30 for training and 15 for testing. All
correlations were performed with a single BPOF constructed for the central view in the
image matrix. Thus correlations were poor in response to the target for many of the
input images. For each correlation four correlation peaks were considered: one for each
of the three vehicles (including the target vehicle) and the highest peak in response to
input clutter.

The input-plane sampling mask consisted of eight spokes of even angular
distribution, each 16 pixels long. These 128 samples were augmented by 8 inputs
derived by applying a one-dimensional Roberts edge-location operator along each
spoke and finding the location of the strongest response. The network was trained to
provide two complementary-logic outputs indicating whether the sampled object was a
target or not. These outputs were algebraically combined to yield a confidence level.
Figure 11 is a histogram plot of the results of the trained network applied to the
15-image (60-peak) test set. The a priori target-to-nontarget ratio for these inputs is 1:3.

As can be seen, the network provided excellent separation of targets and nontargets
(i.e., very useful confidence level outputs). Figure 12 shows correlation intensity plots
corresponding to perfect target-filter match (part a) and extreme target-filter mismatch
(part b). Other correlations are expected to fall between these extremes.

Current work is addressing more challenging images, variations of the input-plane
sampling mask, and the application of different types of neural network architectures to
this problem.


7.  CONCLUSION 

The work reviewed here has shown definite promise for the development of neural
network approaches that augment hybrid adaptive optical correlation systems.
Although other approaches may be defined for the use of neural networks in automatic
target recognition, the approaches discussed here involve an advantageous combination
of the strengths of the two underlying technologies. In particular, these approaches
allow the two basic strengths of optical correlation (shape-dependent discrimination and
intrinsic location estimation) to be used to their full extent. Neural network
augmentation techniques, when incorporated with the HAC concept, should permit the
development of more efficient and powerful systems for addressing complex pattern
recognition.
Fig. 1. Hybrid adaptive correlator (HAC) concept.

Fig. 4. Rotation prediction using backpropagation neural network with correlation peak
inputs.

Fig. 5. Rotation prediction using locally linear neural network with correlation peak
inputs.

Fig. 6. Rotation prediction using backpropagation neural network with input image
inputs.

Fig. 7. Peak height performance of every-pixel filter synthesized by backpropagation
neural network.

Fig. 8. Typical input image (a), corresponding neural-network-synthesized nonapixel
filter (b), and resulting correlation intensity (c).

Fig. 9. Peak-to-clutter performance of nonapixel filter synthesized by backpropagation
neural network.
Abstract

Stretch  and  hammer  neural  networks  use  radial  basis  function  methods  to  achieve  advantages  in 
generalizing  training  examples.  These  advantages  include  (1)  exact  learning,  (2)  maximally  smooth 
modeling  of  Gaussian  deviations  from  linear  relationships,  (3)  identical  outputs  for  arbitrary  linear 
combination  of  inputs,  and  (4)  training  without  adjustable  parameters  in  a  predeterminable  number  of 
steps.  Stretch  and  hammer  neural  networks  are  feedforward  architectures  that  have  separate  hidden 
neuron  layers  for  stretching  and  hammering  in  accordance  with  an  easily  visualized  physical  model. 
Training  consists  of  (1)  transforming  the  inputs  to  principal  component  coordinates,  (2)  finding  the  least 
squares  hyperplane  through  the  training  points,  (3)  finding  the  Gaussian  radial  basis  function  variances 
at  the  column  diagonal  dominance  limit,  and  (4)  finding  the  Gaussian  radial  basis  function  coefficients. 
The  Gaussian  radial  basis  function  variances  are  chosen  to  be  as  large  as  possible  consistent  with 
maintaining  diagonal  dominance  for  the  simultaneous  linear  equations  that  must  be  solved  to  obtain  the 
basis  function  coefficients.  This  choice  insures  that  training  example  generalization  is  maximally 
smooth  consistent  with  unique  training  in  a  predeterminable  number  of  steps.  Stretch  and  hammer 
neural  networks  have  been  used  successfully  in  several  practical  applications. 

1.  Physical  Model 

In  the  same  sense  that  thin  plate  spline  interpolation  has  a  "bending"  model  in  which  an  elastic  plane  is 
deformed  into  contact  with  the  data  points,  stretch  and  hammer  neural  networks  [Gustafson  et  al.,  1991, 
1992]  have  a  physical  model  in  which  the  data  input  plane  is  similarly  deformed.  In  this  model  the 
input  plane  is  first  stretched  along  orthogonal  coordinates  located  in  the  plane  so  that  the  data  inputs 
(relative  to  their  means)  have  equal  variances  and  zero  covariances.  (This  procedure  is  a  principal 
components  transformation  on  the  data  inputs).  Next  a  least  squares  hyperplane  is  found  for  the 
transformed  data.  Finally,  the  data  inputs  in  the  stretched  coordinates  are  projected  onto  the  hyperplane 
and  hammered  into  contact  with  the  data  outputs  using  numerous  small  strikes  so  that  the  hyperplane  is 
smoothly  deformed.  Hammering  has  (typically)  Gaussian  precision  with  variance  such  that  the  ratio  of
strike  density  at  each  data  input  to  the  sum  of  strike  densities  at  all  other  inputs,  i.e.,  the  ratio  of  hits  to 
misses,  exceeds  unity. 


2.  Advantages 

It has been proved that if the number of training points is much greater than the number of inputs, the
stretch and hammer neural network places guaranteed upper limits on required numerical precision and
number of computational steps and that it also places guaranteed lower limits on certain measures of
interpolation smoothness and stability [Gustafson et al., 1992]. Here smoothness as measured by the
smallest hammering standard deviation is maximized by selecting the largest value consistent with an
acceptable level of computational effort, and this value is obtained by setting the minimum ratio of hits
to misses equal to $(k + 1)/(k - 1)$, where $k$ is the largest acceptable 2-norm condition number of the
matrix $F$ whose inverse determines the basis function coefficients. Also, stability as measured by the
reciprocal of the root mean square fractional change in the number of strikes at each data point is
bounded below by $[(kr)^{-1} - 1]/2$, where $r$ is the fractional root mean square change in the data outputs (i.e.,
good stability implies that small data output changes yield small interpolation function changes).
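
The hits-to-misses condition can be restated in terms of column sums of $F$. The following short derivation is a sketch under one assumption that is an interpretation rather than a statement from the text: each Gaussian has unit height at its own data input, so the ratio of hits to misses for the jth data input is $1/d_j$, with $d_j$ the sum of the Gaussian values at all other data inputs.

```latex
\[
\frac{1}{d_j} = \frac{k+1}{k-1}
\;\Longrightarrow\;
d_j = \frac{k-1}{k+1}
\;\Longrightarrow\;
\frac{1+d_j}{1-d_j}
  = \frac{(k+1)+(k-1)}{(k+1)-(k-1)}
  = \frac{2k}{2} = k .
\]
```

Since $k > 1$ gives $d_j < 1$, the condition keeps each column of $F$ strictly diagonally dominant.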

Since  the  stretch  and  hammer  neural  network  is  an  interpolator  and  extrapolator  (although  modifications 
to  enforce  additional  smoothness  at  the  expense  of  exact  data  fitting  are  possible),  exact  learning  is 
achieved.  Also,  since  the  hammering  precision  of  radial  basis  functions  is  typically  Gaussian  with  the 
maximum  practical  standard  deviation,  the  network  provides  maximally  smooth  modeling  of  Gaussian 
deviations  from  linear  relationships.  Furthermore,  since  the  data  inputs  are  transformed  by  stretching  to
principal  component  coordinates,  the  network  provides  identical  outputs  for  arbitrary  linear 
combinations  of  inputs.  Finally,  training  is  achieved  without  adjustable  parameters  with  a 
computational  effort  governed  by  k  in  terms  of  bounds  on  required  numerical  precision  and  number  of 
computational  steps. 

3.  Training  Procedure 

Typical stretch and hammer neural network training consists of standard operations that yield the
mathematical specification outlined in Figure 1 and detailed in the references [Gustafson et al., 1991,
1992]. First the data inputs $x_j$ and their means $\bar{x}_j$ for all data points are expressed in principal
component coordinates $u_i$, where $a_{ij}$ are the linear transformation coefficients (see Figure 1 for
notation). Next a least squares hyperplane is fitted to the data points, where $b_0$ is the hyperplane
intercept and $b_1, b_2, \ldots, b_n$ are the hyperplane slopes. Next the Gaussian radial basis function standard
deviations $s_j$ are selected to be as large as possible consistent with maintaining acceptable diagonal
dominance for $F$, whose elements are given by the radial Gaussian functions $f_j$ evaluated at the stretched
data inputs. Thus, each $s_j$ is selected such that $(1 + d_j)/(1 - d_j) = k$, where $d_j$ is the sum of the
off-diagonal elements of the jth column of $F$. Finally, the basis function coefficients $c_j$ are obtained by
solving m simultaneous linear equations in m unknowns.
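
The four training steps can be sketched for the one-input case, where the principal component transformation reduces to centering and scaling the input. This is an illustrative reading of the procedure, not the authors' code: the function names, the bisection search for each standard deviation, the default value of k, and the requirement that training inputs be distinct are all assumptions.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def train_stretch_hammer(xs, ys, k=3.0):
    n = len(xs)
    # (1) Stretch: zero mean and unit variance (1-D principal components).
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    us = [(x - mean) / std for x in xs]
    # (2) Least squares line; us has zero mean, so the intercept is mean(y).
    intercept = sum(ys) / n
    slope = sum(u * (y - intercept) for u, y in zip(us, ys)) / sum(u * u for u in us)
    # (3) Largest width whose off-diagonal column sum d_j gives (1+d_j)/(1-d_j) = k.
    target = (k - 1.0) / (k + 1.0)
    def column_sum(j, s):
        return sum(math.exp(-(us[i] - us[j]) ** 2 / (2.0 * s * s))
                   for i in range(n) if i != j)
    widths = []
    for j in range(n):
        lo, hi = 1e-6, 1.0
        while column_sum(j, hi) < target:
            hi *= 2.0
        for _ in range(60):  # bisection on the increasing function column_sum
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if column_sum(j, mid) < target else (lo, mid)
        widths.append(lo)
    # (4) Hammer: solve F c = residuals from the least squares line.
    f = [[math.exp(-(us[i] - us[j]) ** 2 / (2.0 * widths[j] ** 2))
          for j in range(n)] for i in range(n)]
    residuals = [y - (intercept + slope * u) for u, y in zip(us, ys)]
    coeffs = solve_linear(f, residuals)
    return {"mean": mean, "std": std, "intercept": intercept, "slope": slope,
            "centers": us, "widths": widths, "coeffs": coeffs}

def predict(model, x):
    u = (x - model["mean"]) / model["std"]
    bumps = sum(c * math.exp(-(u - uj) ** 2 / (2.0 * s * s))
                for c, uj, s in zip(model["coeffs"], model["centers"], model["widths"]))
    return model["intercept"] + model["slope"] * u + bumps
```

A model trained on, say, `xs = [0, 1, 2, 3, 5]` and `ys = [0, 1, 4, 9, 25]` reproduces the training outputs at the training inputs (exact learning), and far from the data its prediction falls back to the least squares line because the Gaussian terms vanish.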

4.  Example  Results 

Figures  2, 3,  and  4  show  example  results  for  the  stretch  and  hammer  neural  network.  Figure  2  compares 
stretch and hammer and natural cubic spline interpolation for one input. Note that the curves are
comparable except for sparse interpolation regions, where the stretch and hammer curve approaches the
least  squares  line.  Figure  3  shows  a  least  squares  plane  fitted  and  hammered  to  four  two-input  training 
points  with  the  two  stretch  (principal  component)  coordinates  indicated,  and  Figure  4  shows  these 
training  points  in  the  stretched  coordinates  with  a  least  square  plane  fitted  and  hammered  in  these 
coordinates.  Note  that  interpolation  in  the  stretched  coordinates  is  smoother  than  in  the  original 
coordinates. 

5.  Practical  Application 

Figures  5  and  6  show  practical  application  of  the  stretch  and  hammer  neural  network  to  an  adaptive 
optical  correlation  system  designed  to  track  targets  in  images  [Flannery  and  Gustafson,  1991].  The 
network  synthesized  binary  phase  Fourier  plane  filters  using  31  samples  of  the  target  region  in  the  input 
scene.  The  network  was  trained  to  synthesize  filters  that  maintain  a  high  correlation  peak  to  clutter  ratio 
for  clutter  backgrounds  not  used  in  training  (Figure  5)  and  for  such  backgrounds  plus  target  rotation 
angles  not  used  in  training  (Figure  6).  Note  the  favorable  comparison  with  the  zero  degree  filter  (a 
fixed  filter  designed  for  zero  degree  rotation)  and  the  best  expected  filter  (the  best  filter  that  could  have 
been  synthesized). 
Figure 1. Specification of a trained stretch and hammer neural network using
Gaussian radial basis functions.

Figure 2. Comparison of stretch and hammer and natural cubic spline interpolation
for one-input training examples.

Figure 3. A least squares plane fitted and hammered to four two-input training
examples with the two stretched (principal component) coordinates
indicated.

Figure 5. Correlation peak to clutter ratio versus in-plane target rotation angle for
three correlation filters, where a clutter background not used in training was
employed to test the filter synthesized by the stretch and hammer neural
network.

Figure 6. As in Figure 5 except that both a clutter background and target rotation
angles not used in training were employed for testing.
ABSTRACT


OPTICAL FILTER SYNTHESIS USING ARTIFICIAL NEURAL
NETWORKS

Manzardo, Mark, August
University of Dayton, 1992

The feasibility of using neural networks to synthesize filters for a hybrid
adaptive correlator (HAC) was demonstrated. Input scene binarization, filter
generation, filter encoding, input scene sampling, and target rotation were
considered in developing neural networks for target tracking. Neural networks
were trained using target-plus-background image samples as inputs and coded
filter values as outputs. After learning, input image samples not included in
training were used to test the neural networks. The resulting coded filter
values were evaluated using computer-simulated optical correlation with
ternary phase amplitude filters (TPAF). Back-propagation and stretch and
hammer neural networks successfully synthesized filters for optical correlation,
and performance was adequate for tracking rotated targets on various
backgrounds. Typical correlation peak-to-clutter ratios were 3 to 9 dB for
in-plane target rotation angles of 0 to 90 degrees.
1. INTRODUCTION

1.1. Hybrid adaptive correlator technology

Pattern recognition by optical correlation is accomplished by
intentionally modifying the spatial frequency spectrum of an image, and thus
it is a subset of Fourier optical signal processing. The VanderLugt correlator
introduced modern optical signal processing concepts [VanderLugt, 1963]. The
correlation theorem in Fourier analysis states that the correlation of functions
$f_1$ and $f_2$ is

$$C(x,y) = F^{-1}\{ F[f_1(x,y)] \; F^{*}[f_2(x,y)] \}$$

where $F$ is the Fourier transform operator,
$*$ represents the complex conjugate, and
$F^{-1}$ represents the inverse Fourier transform operator.

By using the properties of lenses and coherent light, the correlation function
can be produced at the Fourier transform plane of the second lens illustrated
in Figure 1.1. For pattern recognition $f_1$ is an input image and $F_2$ is the
conjugate Fourier transform of the target being searched for in the input scene.
In general $F_2$, the correlator filter, is complex valued. The process of using the
actual amplitude and conjugate phase of a target as described above is called
matched filter or VanderLugt correlation. Using a matched filter requires
holographic recording [Goodman, 1968], which is not practical for real-time
pattern recognition. However, magneto-optic spatial light modulators
(MOSLMs) can be used for real-time pattern recognition [Ross, 1983]. These
devices are capable of modulating incident light using three states [Kast,
1989], e.g., full amplitude with 0 degree phase shift, full amplitude with 180
degree phase shift, and zero amplitude.
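
As a numerical check of the correlation theorem, the following minimal sketch correlates a scene with a target by multiplying the scene spectrum by the conjugate target spectrum. Python with NumPy is used only for illustration and is not part of the original work; the 16 x 16 test arrays and the function name `correlate` are assumptions.

```python
import numpy as np


def correlate(f1, f2):
    """Circular cross-correlation C = F^-1{ F[f1] * conj(F[f2]) }."""
    F1 = np.fft.fft2(f1)
    F2_conj = np.conj(np.fft.fft2(f2))  # conjugate transform of the target (the filter)
    return np.fft.ifft2(F1 * F2_conj)


# Hypothetical test data: a small target and a scene holding a shifted copy of it.
target = np.zeros((16, 16))
target[0:3, 0:2] = 1.0
scene = np.roll(target, shift=(5, 7), axis=(0, 1))

c = np.abs(correlate(scene, target)) ** 2
peak = np.unravel_index(np.argmax(c), c.shape)
print(peak)  # location of the brightest point of the correlation plane
```

The brightest point of the output falls at the (row, column) shift of the target, which is how a correlation peak reports target position.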

Recent work has shown that in the area of image analysis phase
information is more important than amplitude information [Oppenheim, 1981].
Computer correlation simulations using only phase information led to the
phase-only filter (POF) concept [Horner, 1984]. A POF is advantageous because
it does not attenuate the amplitude of the optical beam. Phase-only filtering
can produce correlation peaks many times more intense than simple matched
filtering. Another typical advantage of phase-only filtering is localization of the
correlation peak. A POF often yields a sharp peak whereas a matched filter
yields a wide peak [Flannery, 1989].

Modern real-time SLM devices do not allow the implementation of
complete phase modulation. A subset of the POF, namely the binary phase-only
filter (BPOF), can be used with modern devices [Flannery, 1988]. A BPOF
requires only 1 bit for filter storage per pixel, whereas a continuous POF
requires 4 or more bits. BPOFs are designed using a threshold line angle (TLA)
parameter [Flannery, 1988] described in Section 4.

A matched filter by definition performs best using signal-to-noise ratio
as a metric for the case of additive Gaussian noise. However, POFs and BPOFs
have performed better than matched filters when used with real-world
backgrounds [Fielding, 1990]. The use of BPOF techniques is motivated by
possible immediate implementation using available electronically addressed
spatial light modulators (SLMs) in the hybrid adaptive correlator (HAC)
system. BPOFs for optical correlation have been successfully implemented
[Psaltis, 1984]. The HAC system uses electronics and optics to perform pattern
recognition and is illustrated in Figure 2.1. The operation of this system is
further discussed in Section 2.

The  scene  must  be  binarized  to  be  implemented  in  a  HAC  system  using 
MOSLMs.  Binarization  is  accomplished  using  algorithms  which  edge-enhance 
and  then  binarize  the  output  by  thresholding  pixel  values.  Various  binarized 
scenes  can  be  created  by  varying  the  threshold,  including  scenes  that  correlate 
well  with  a  particular  filter.  However,  correlation  performance  is  usually 
compromised  for  other  backgrounds.  A  discussion  of  techniques  for  binarizing 
input  scenes  is  presented  in  Section  3. 

The filters considered in this research are 128 by 128 ternary phase
amplitude filters (TPAFs) consisting of a BPOF multiplied by a bandpass
binary amplitude mask with a low-spatial-frequency block radius of 10 pixels
and a high-spatial-frequency cut-off radius of 60 pixels, and they are encoded
to allow for neural network implementation. The BPOF part of a TPAF
is created by thresholding the real part of the Fourier transform of the target,
which implies a TLA of 0 degrees and thus a symmetric filter (so that storing
or encoding only one-half of the filter values is necessary). For a 128 by 128
image this thresholding strategy implies a need to store 5400 separate filter
values, since values outside the 10-60 pixel radius bandpass are set to zero.
Neural networks typically require excessive training time to learn this number
of values. A discussion of this concern is given in Section 4.
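
The number of stored values can be checked with a short counting sketch. It assumes that zero spatial frequency sits at pixel (64, 64) and that both band edges are inclusive; neither convention is stated in the text, and the function name `count_tpaf_values` is hypothetical.

```python
def count_tpaf_values(n=128, r_block=10, r_cut=60):
    """Count pixels with r_block <= radius <= r_cut, then halve for filter symmetry."""
    center = n // 2  # assumed location of zero spatial frequency
    in_band = 0
    for row in range(n):
        for col in range(n):
            r2 = (row - center) ** 2 + (col - center) ** 2
            if r_block ** 2 <= r2 <= r_cut ** 2:
                in_band += 1
    return in_band // 2


print(count_tpaf_values())
```

The half-annulus area, pi (60^2 - 10^2) / 2, is about 5500, so the count is of the same order as the 5400 values quoted above; the exact figure depends on the edge convention.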

1.2. Neural networks for HAC filter synthesis

A  large  number  of  filters  are  needed  to  accommodate  rotations,  scale 
changes,  and  other  distortions  in  the  input  scene  target.  Instead  of  correlating 
with  each  of  these  filters,  a  neural  network  can  be  used  to  synthesize  the  filter 
for  correlation  as  the  input  scene  target  distortions  evolve.  A  consequent 
reduction  in  processing  time  could  enable  feasibility  for  the  HAC  system  for 
real-time  target  recognition. 

Originally, work conducted using neural networks for filter synthesis
used network inputs from the correlation peak. On low intensity backgrounds
this technique was successful, but for significantly cluttered backgrounds
neural network performance deteriorated, even for strong correlation peaks
[Olczak, 1991]. However, the research reported here indicates that the use of
gray-level input plane intensity values centered on the location of the
correlation peak enables acceptable neural network filter synthesis. Studies on
the use of input plane samples for neural network inputs have been made
[Olczak, 1991]. For this technique to be successful, an appropriate region on
the input scene must be sampled. Using the location indicated by the
correlation peak assumes that a target has already been recognized, and hence
neural network filter synthesis is intended only for target tracking. However,
if the correlation peak or some other technique indicates the location of a
blob-like object, then a neural network can be used to confirm that this object is or
is not a target.

The input scene must be sampled to produce the neural network inputs.
A variety of input scene sampling techniques were investigated. The simplest
consisted of 25 pixel values on a 5 x 5 grid centered on the target. To
incorporate more of the background in the input samples, a 9 x 9 grid was
used, and two separate algorithms were employed to produce 25 or 31 input
values for the neural network. The choice of a sampling technique is important
for adequate neural network interpolation between samples. Sampling
techniques are discussed further in Section 5.

Two different types of neural networks were used as interpolators to
synthesize filters: the well known back-propagation neural network and a
recently developed "stretch and hammer" neural network. The general idea
was to input enough representative examples to train the neural networks to
approximate a desired input/output behavior for filter synthesis. The inputs
are gray-level pixel values from the input scene to be binarized and coded for
the input plane SLM. The outputs are the coded values that characterize the
TPAF for the filter plane SLM. There are advantages to using the stretch and
hammer as well as the back-propagation neural network. These advantages
and an in-depth analysis of neural network design are presented in Section 6.

Successful neural network filter synthesis was accomplished using both
back-propagation and stretch and hammer neural networks. A variety of
input/output relationships were established. Testing of neural network performance
was accomplished by varying input scene parameters that included in-plane
target rotation angle and target background type. The results are presented
and tabulated in Section 7.
2. HAC SYSTEM OPERATION

The HAC is an electro-optical pattern recognition system that integrates
electronics and the computational power of lenses. From Fourier optics it is
known that, using a coherent light beam, a lens transforms an input 2-D
pattern into its Fourier spatial spectral components at the focal plane of the
lens. Each unique input object (target) has associated with it a unique Fourier
spectrum. By filtering the spectrum for a certain object in the image, it is
possible to recognize the presence of the object. This process is known as
pattern recognition by optical correlation, and can be implemented using the
optical set-up in Figure 2.1a.

2.1.  Steps  in  HAC  operation 

The first step in the operation of the HAC system is the acquisition of
the input scenes. The images used in this research are gray-level visible and
infrared images originating from two sources: the University of Southern
California Image Processing Institute Data Base (USC), and the United States
Army Center for Night Vision & Electro-Optics 1987 Multi-Sensor Field Test
(ARMY) (see Appendix for examples). These images comprise a "test-world" for
research purposes.
The second step in the operation of the HAC system is processing the
available  input  scenes.  A  computer  is  used  to  binarize  the  images  and  to 
download  the  results  to  the  input  plane  SLM.  The  binarization  process  is 
discussed  in  Section  3.  The  binarized  image  is  stored  for  later  use  in  the 
process  of  optical  correlation  by  computer  simulation. 

The  third  step  in  the  operation  of  the  HAC  system  involves  hybrid 
electro-optical functions. As shown in Figure 2.1a, a HeNe laser beam
(wavelength  632.8nm)  is  expanded  and  collimated  to  illuminate  the  input  scene 
SLM.  The  limited  modulation  capability  of  the  SLM  requires  a  binary  image. 
This  SLM  acts  as  a  transparency  so  that  the  output  is  a  pure  amplitude 
function  encoding  the  binarized  image.  The  first  lens  generates  the  Fourier 
transform  of  this  function  at  the  filter  plane  SLM  (in  a  time  that  equals  the 
input-filter  plane  distance  divided  by  the  speed  of  light).  The  computer 
processor retrieves a filter from a previously stored bank of filters. The filters
are  ternary  phase  amplitude  filters  (TPAFs)  and  are  further  discussed  in 
Section  4.  These  filters  modulate  the  phase  using  two  phase  states  or  block  the 
light  completely.  The  second  SLM  implements  a  filter  by  simple  multiplication 
with  the  Fourier  transform  of  the  input  scene.  If  only  the  target  is  present  the 
output  of  the  filter  SLM  approximates  a  plane  wave  with  a  direction  related 
to  the  target  position.  The  second  lens  produces  the  Fourier  transform  of  the 
output  of  the  second  SLM  at  the  2-D  detector.  If  a  target  is  present  in  the 
input  scene,  then  a  plane  wave  is  the  output  of  the  second  SLM  and  a  delta 
function  or  bright  spot  appears  on  the  detector  at  the  target  location. 
Step four of HAC system operation requires a decision process. The
output of the 2-D detector is processed using a peak-finding routine. If a peak
is found that is distinguishable from clutter, then target recognition is
accomplished; if no peak is found then the processor must download the next
filter from the bank of filters to the filter plane SLM. The process of steps 3 and
4 is carried out until either a target is recognized or all of the filters are
scanned with no recognition accomplished. There exist smart filtering
strategies which do not require all filters to be scanned.
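
The decision of step four can be sketched as follows. The peak-finding routine itself is not specified in the text, so this is only one plausible form: it takes the brightest detector pixel as the candidate peak, treats the largest value outside a small guard window as the clutter peak, and compares their ratio in decibels with a threshold. The names `find_peak` and `target_recognized`, the guard window, and the 3 dB threshold are all assumptions.

```python
import math


def find_peak(detector, guard=1):
    """Return (row, col, ratio_db) for the brightest pixel of a 2-D detector frame.

    ratio_db compares the brightest pixel with the largest value outside a
    (2 * guard + 1)-pixel square window around it, taken here as the clutter peak.
    """
    peak, pr, pc = max(
        (value, r, c)
        for r, row in enumerate(detector)
        for c, value in enumerate(row)
    )
    clutter = max(
        value
        for r, row in enumerate(detector)
        for c, value in enumerate(row)
        if abs(r - pr) > guard or abs(c - pc) > guard
    )
    if clutter <= 0:
        return pr, pc, math.inf
    return pr, pc, 10.0 * math.log10(peak / clutter)


def target_recognized(detector, min_db=3.0, guard=1):
    """Decision of step four: is the peak distinguishable from clutter?"""
    return find_peak(detector, guard)[2] >= min_db


# Hypothetical 6 x 6 detector frame: uniform clutter of 1.0 and one peak of 8.0.
frame = [[1.0] * 6 for _ in range(6)]
frame[2][3] = 8.0
row, col, ratio_db = find_peak(frame)
print(row, col, round(ratio_db, 2), target_recognized(frame))
```

The frame must be larger than the guard window, otherwise no clutter pixels remain to compare against.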

2.2. Neural networks in HAC operation

The process in step 3 can be simulated by computer using appropriate
digital image processing algorithms. The Fast Fourier Transform (FFT)
algorithm accomplishes the task of the first lens. This algorithm inputs the
stored real-valued binarized scene (created in step 2 of the HAC process) and
outputs the complex-valued Fourier transform of the scene. This complex-valued
function is then multiplied pixel-by-pixel with one of the filters from the
filter bank, and the Inverse Fast Fourier Transform (FFTI) is then performed.
The squared modulus operation is performed on the FFTI output, which
simulates the 2-D detector recording (assuming that the 2-D detector responds
linearly with irradiance). This output is then processed as in step 4 of the HAC
system. Once again steps 3 and 4 are repeated until a target is recognized
or the bank of filters is exhausted.
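
The simulated steps 3 and 4 can be sketched in a few lines. This is a minimal illustration using NumPy, not the software used in this research: the function names, the 3 dB decision threshold, the guard window, and the two bar-shaped targets are assumptions, and the filters are stored in the unshifted layout returned by `np.fft.fft2`.

```python
import numpy as np


def simulate_correlator(binary_scene, filt):
    """FFT (first lens), pixel-by-pixel filter multiply, inverse FFT, squared modulus."""
    spectrum = np.fft.fft2(binary_scene)
    field = np.fft.ifft2(spectrum * filt)
    return np.abs(field) ** 2


def scan_filter_bank(binary_scene, bank, min_db=3.0, guard=2):
    """Return the index of the first filter giving a distinguishable peak, else None."""
    for index, filt in enumerate(bank):
        plane = simulate_correlator(binary_scene, filt)
        pr, pc = np.unravel_index(np.argmax(plane), plane.shape)
        peak = plane[pr, pc]
        # Blank a small window around the peak (wrap-around ignored for brevity);
        # the largest remaining value is taken as the clutter peak.
        rest = plane.copy()
        rest[max(pr - guard, 0):pr + guard + 1, max(pc - guard, 0):pc + guard + 1] = 0.0
        clutter = rest.max()
        if peak > 0 and (clutter == 0 or 10.0 * np.log10(peak / clutter) >= min_db):
            return index
    return None


def cosine_bpof(target):
    """Binary phase-only filter from the sign of the real part of the target spectrum."""
    return np.where(np.fft.fft2(target).real >= 0.0, 1.0, -1.0)


# Hypothetical filter bank built from two bar targets; the scene holds the second one.
horizontal = np.zeros((32, 32))
horizontal[0, 0:6] = 1.0
vertical = np.zeros((32, 32))
vertical[0:6, 0] = 1.0
scene = np.roll(vertical, shift=(10, 12), axis=(0, 1))
bank = [cosine_bpof(horizontal), cosine_bpof(vertical)]
print(scan_filter_bank(scene, bank))
```

Each pass through the loop costs one forward and one inverse transform plus a search of the output plane, which is the processing burden that motivates synthesizing a single filter instead.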

Typically, extensive processing time is needed not only to scan through
the filters but also to search each output of the 2-D detector array for a
correlation peak. Figure 2.1b illustrates the concept of incorporating a neural
network processor in the HAC system to reduce the number of correlations
performed. The research performed here assumes that the location of a
potential target is known. The input scene is gray-level and at the disposal of
the neural processor, and following a consistent sampling technique using
these gray-level values a neural network can be used to synthesize a filter for
correlation in a short time. Only one filter is synthesized by the network, and
hence only one complete correlation is necessary to determine whether or not
a target is in the scene. Selection of the sampling technique is important and
is discussed in Section 5. Selection of the neural network also affects the
results, and a discussion of appropriate neural networks is given in Section 6.
Figure 2.1 Hybrid adaptive correlator (a) with a filter bank, (b) with a
neural processor.
3. INPUT SCENE BINARIZATION

3.1. Binarization techniques

Two important image storage techniques are used in digital image
processing. The first method stores an image by coded variations in intensity,
which is accomplished by assigning one of many gray-level intensity values to
each pixel in the image. This type of coding is limited by the dynamic range of
the acquisition equipment or in some cases by the storage system. The pixels
in the images used in this research are assigned gray-level pixel values from
0 (black) to 255 (white). Each pixel requires one byte of storage space. For an
image of 128 by 128 pixels the required storage space is 16,384 bytes (not
including 124 bytes for the image file header). Many optical devices, such as
the MOSLMs, cannot represent a gray-level image and hence a more restrictive
coding method must be employed. The second coding method thresholds each
pixel at some value between 0 and 255 so that the resulting image file is filled
with either 0's (representing "off") or 255's (representing "on") at each pixel
location.

Gray-level images contain more target information and it would be
advantageous to keep the images in gray-level format for use in the optical
correlator. If this were possible the binarization preprocessing step would not
be necessary. The most important reason for binarizing images is the
restrictions on the SLMs used in the input and filter planes of the HAC system.
At each pixel location these devices either block the light completely, let the
light pass through unchanged, or phase shift the light by 180 degrees, and it
is not possible to download a gray-level image to such a device. The input
plane SLM creates a pure binary amplitude encoded image similar to the
coding that could be obtained by placing a mask with transparent and opaque
regions at the SLM location.

Studies on binarization techniques have been made [Johnson, 1991]. A
simple technique uses the original gray-level input image values and
thresholds on a value between 0 and 255. The binarized image has different
characteristics for different threshold values. A low threshold value tends to
transform a large number of the pixels to the 255 or "on" value and yields
"blob-like" binarized image characteristics. A high threshold value tends to
result in a binarized image with less information and, in some instances, too
little information to recognize a target. Thus, a simple thresholding value for
binarization is not adequate for implementation in the HAC system.
Histograms of pixel values for typical backgrounds and targets in Figures 3.1
through 3.5 show that the background and target pixel intensity distributions
commonly overlap. Thresholding at a particular value tends to emphasize the
background information as much as the target information. Thus another
approach which separates the pixel intensity distributions of the background
and target before binarization is needed.
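
The effect of the threshold value on a simply thresholded image can be seen in a small sketch. The 4 x 4 gray-level values are made up for illustration, and mapping values at or above the threshold to "on" is an assumed convention.

```python
def binarize(image, threshold):
    """Map each gray level (0-255) to 255 ("on") if it is >= threshold, else 0 ("off")."""
    return [[255 if pixel >= threshold else 0 for pixel in row] for row in image]


def fraction_on(binary):
    """Fraction of pixels set to the "on" value."""
    total = sum(len(row) for row in binary)
    return sum(pixel == 255 for row in binary for pixel in row) / total


# Hypothetical gray-level image.
image = [
    [10, 40, 90, 200],
    [20, 60, 120, 220],
    [30, 80, 150, 240],
    [35, 85, 160, 250],
]
low = fraction_on(binarize(image, 50))    # low threshold: most pixels switch on
high = fraction_on(binarize(image, 210))  # high threshold: little information is left
print(low, high)
```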


3.2. Binarization technique

A successful binarization technique has been developed [Johnson, 1991].
In this technique the input image is edge enhanced before thresholding, as
described below. Edge enhancement is accomplished using a 2 x 2 square grid
of pixel values from the input image. The maximum difference of pixel
intensity within the grid is computed. This value replaces the upper left corner
value of the grid in the new edge-enhanced image. Histograms of pixel values
of backgrounds after edge enhancement illustrated in Figures 3.1 through 3.5
show a strong shift toward a low intensity distribution, which is helpful in
selecting a threshold level. Using statistics a threshold level can be chosen
such that a large portion of the background pixels can be set to zero while not
severely degrading the target. This level is selected using the expression

$$\text{Threshold} = \text{mean} + \text{SDM} \times \text{standard deviation},$$

where SDM is defined as the standard deviation multiplier. Appropriate
selection of the SDM yields improved results for optical correlation using the
peak-to-clutter ratio as a metric. An SDM of 1.8 as illustrated in Figure 3.6
was found to yield the best results using the peak-to-clutter ratio metric and
was used for the research reported here. The peak-to-clutter ratio is the ratio
in decibels of the energy in the target peak to the energy in the largest clutter
peak.
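
The edge enhancement and statistical threshold described above can be sketched as follows. Several details are assumptions rather than facts from [Johnson, 1991]: the last row and column of the edge image are left at zero, the population standard deviation is used, and pixels strictly above the threshold are switched on. The function names and the test scene are hypothetical.

```python
import statistics


def edge_enhance(image):
    """Replace each pixel by the maximum difference within the 2 x 2 grid
    whose upper left corner it is."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows - 1):
        for c in range(cols - 1):
            block = (image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
            out[r][c] = max(block) - min(block)
    return out  # last row and column stay 0 (assumed border handling)


def sdm_threshold(edge_image, sdm=1.8):
    """Threshold = mean + SDM * standard deviation of the edge-enhanced image."""
    pixels = [p for row in edge_image for p in row]
    return statistics.mean(pixels) + sdm * statistics.pstdev(pixels)


def binarize_scene(image, sdm=1.8):
    """Edge enhance, then switch on the pixels above the statistical threshold."""
    edges = edge_enhance(image)
    threshold = sdm_threshold(edges, sdm)
    return [[255 if p > threshold else 0 for p in row] for row in edges]


# Hypothetical 6 x 6 scene: flat background of 10 with one bright pixel of 200.
scene = [[10] * 6 for _ in range(6)]
scene[2][2] = 200
binary = binarize_scene(scene)
print(sum(p == 255 for row in binary for p in row))  # number of "on" pixels
```

Because the flat background produces zero edge values, the mean of the edge image is low and only the pixels around the bright spot survive the threshold.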
Figure 3.1 (a) Truck target with histogram, (b) truck target after
edge enhancement with histogram.

(a) Truck on dtyTO background with histogram, (b) truck on dtyTO
background after edge enhancement with histogram.

(a) Truck on newbkSO background with histogram, (b) truck on newbkSO
background after edge enhancement with histogram.


4.  FILTER  ENCODING 


4.1.  Encoding  requirements  and  techniques 

The HAC system uses a MOSLM at the correlator filter plane. Each
pixel in the SLM may be operated in three states: two states that rotate the
plane of polarization approximately + or - 6 degrees and one state that scatters
the light out of the system [Ross, 1983; Psaltis, 1984; Kast, 1989]. Light
incident on the filter plane represents the Fourier transform of the binarized
input image. Light passing through the input plane SLM is linearly polarized,
and light passing through the filter plane SLM is analyzed using an orthogonal
polarizer. Each of the two SLM states that rotate the plane of polarization
contains a component along the pass axis of the analyzer. There is a relative
phase shift of 180 degrees between the two states, and thus a relative
amplitude change is produced. In effect, the pixels set to one state pass the
Fourier transform without alteration, and the pixels set to the other state alter
the Fourier transform by a phase shift of 180 degrees or amplitude
multiplication of -1. The third state scatters most of the light out of the system
and thus can be represented by zero. Therefore, using polarized light, an SLM,
and an analyzer, it is possible to encode a filter with amplitudes of +1, -1, and
0 at each pixel location. The TPAF is defined by these values and thus consists
of a BPOF multiplied by a binary amplitude pattern.
A TPAF filter for pattern recognition is generated by first taking the
Fourier transform of a target, which is accomplished using the FFT algorithm.
In general, the target has a complex-valued Fourier transform so that each
pixel value is represented by a point in the complex plane (e.g. a 128 by 128
image has 16,384 complex-valued samples). Producing a BPOF from these
values requires choosing a line through the complex plane so that all pixel
values on one side of the line are assigned a phase of 0 degrees (amplitude
unchanged) and all values on the other side are assigned a phase of 180
degrees (amplitude multiplied by -1). The angle that this line makes with the
imaginary axis is the Threshold Line Angle (TLA) as illustrated in Figure 4.1
[Flannery, 1988]. Varying the TLA makes it possible to achieve improved
correlation performance [Flannery, 1989]. A TLA of 0 results in a cosine BPOF
that is symmetric and hence reduces the storage requirements of the system
by one-half, so that only 8192 values are needed to characterize the BPOF
portion of the filter. A TPAF filter is then created by multiplying the BPOF
filter by a 10-60 pixel radius bandpass amplitude pattern which sets those
pixels outside this band to the zero state. This bandpass was chosen to reduce
the number of necessary coded values needed to represent the filter.
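
The TPAF construction for a TLA of 0 can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the code used in this research: the function name `make_tpaf` is hypothetical, a real part of exactly zero is assigned to +1, the band edges are inclusive, and the rectangular test target stands in for a binarized truck image.

```python
import numpy as np


def make_tpaf(binary_target, r_block=10, r_cut=60):
    """Cosine BPOF (TLA = 0) times a bandpass binary amplitude mask.

    Returns an integer array with values in {+1, -1, 0}, with zero spatial
    frequency at the array center (fftshift layout).
    """
    spectrum = np.fft.fftshift(np.fft.fft2(binary_target))
    bpof = np.where(spectrum.real >= 0.0, 1, -1)  # threshold the real part
    center_row, center_col = spectrum.shape[0] // 2, spectrum.shape[1] // 2
    rows, cols = np.indices(spectrum.shape)
    radius = np.hypot(rows - center_row, cols - center_col)
    mask = (radius >= r_block) & (radius <= r_cut)  # 10-60 pixel bandpass
    return bpof * mask


# Hypothetical binarized target: a bright rectangle on a 128 x 128 field.
target = np.zeros((128, 128))
target[60:68, 56:72] = 1.0
tpaf = make_tpaf(target)
print(np.unique(tpaf), np.count_nonzero(tpaf))
```

For a real-valued target the real part of the spectrum is unchanged under inversion through zero frequency, which is the symmetry that allows only half of the values to be stored. To use the result with an unshifted FFT, `np.fft.ifftshift` would be applied first.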

4.2. Adopted encoding technique

The above procedure yields an excessive number of outputs for neural
network implementation, and an encoding technique is needed to reduce this
number. The bandpass or binary amplitude pattern for a 128 by 128 pixel filter
has a low-spatial-frequency block radius of 10 pixels and a high-spatial-frequency
cut-off radius of 60 pixels. To further decrease the number of pixel
values, a superpixel filter is produced by defining 3 x 3 superpixels and storing
only one value of +1 or -1 per superpixel. The use of a superpixel filter blurs
the impulse response of the filter, but the correlation performance is still
acceptable. A superpixel filter allows for compression of the number of stored
values from 8192 to 600, which is suitable for neural network implementation
on desktop computers. When the filters are used for correlation each of the 600
superpixel values is expanded to form the nine pixels it represents. Figure 4.2
illustrates a typical 128 by 128 BPOF and the reduced 10-60 bandpass
superpixel filter used for this research for a truck at 0 degree rotation.
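
The expansion of stored superpixel values back to full resolution can be sketched as below. The layout of the 600 values within the bandpass region is not given in the text, so this sketch works on a plain rectangular grid of stored values; the function name `expand_superpixels` is hypothetical.

```python
def expand_superpixels(coarse, size=3):
    """Expand each stored value into a size x size block of identical pixels."""
    fine = []
    for row in coarse:
        expanded_row = [value for value in row for _ in range(size)]
        fine.extend(list(expanded_row) for _ in range(size))
    return fine


# Two stored values become a 3 x 6 patch of filter pixels.
for line in expand_superpixels([[1, -1]]):
    print(line)
```

Each stored value therefore controls nine filter pixels, which is the source of the blurring of the impulse response noted above.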

The input scenes are binarized in accordance with the procedure
described in Section 3. This procedure uses a threshold value related to the
mean and standard deviation of the edge-enhanced input image, so that for
different backgrounds the target is binarized using different thresholds.
Variations in the target due to binarization must be accounted for when
creating a filter. By superimposing the target on different backgrounds and
examining the image after binarization, it is possible to create a binarized
target for filter generation that correlates well with a variety of backgrounds.
The approach taken here binarizes the target for filter generation using many
SDM multipliers. Each of these binarized targets is used to generate a filter,
and each of these filters is used in a computer simulated correlation with the
target superimposed on a variety of backgrounds.
The SDM multipliers for filter generation varied from 3.0 to 5.0.
Different SDM multipliers were needed for filter generation because in this
process background information is not present. The binarization process was
repeated for each orientation of the target used in training a neural network.
The performance of the filter generated for a particular SDM may be good for
a particular background but poor for another. By examining the peak-to-clutter
ratio for each SDM and each background, it was possible to choose an SDM
and generate a filter that had acceptable correlation performance for any of the
backgrounds illustrated in the Appendix.
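
One way to automate this choice is to pick the SDM whose worst peak-to-clutter ratio over all backgrounds is largest. The text does not state the selection rule that was actually used, so this max-min rule is an assumption, and the numbers below are placeholders rather than results from this research.

```python
def choose_sdm(pcr_by_sdm):
    """Pick the SDM whose worst-case peak-to-clutter ratio over all backgrounds is largest."""
    return max(pcr_by_sdm, key=lambda sdm: min(pcr_by_sdm[sdm]))


# Placeholder peak-to-clutter ratios in dB (NOT measured values):
# one list per candidate SDM, one entry per background.
placeholder = {3.0: [5.0, 1.0], 4.0: [4.0, 3.0], 5.0: [6.0, 2.0]}
print(choose_sdm(placeholder))
```

A max-min rule favors a filter that is acceptable on every background over one that is excellent on a single background.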

The choice of the SDM for filter generation was important for acceptable
correlation performance and varied chaotically with rotation of the target. Thus
it was necessary to fix the SDM value for the input image for use in a real
system. The SDM value for the input image varied slightly from 1.5 to 2.1, but
the best results using peak-to-clutter as a metric were at 1.8. Table 4.1 shows
SDM values used for filter generation for target rotations from 0 to 90 degrees
and for an input image binarization SDM of 1.8 that produced the best
correlation performance. For 0 degree target rotation an SDM of 3.6 was used
to produce the binarized target illustrated in Figure 4.3. This target can be
compared to the binarized truck in the input scenes illustrated in Figure 3.6.
Target rotation (degrees)    SDM
...                          4.4
88                           4.6
90                           4.4

Table 4.1 SDM values for filter generation using an input plane SDM of 1.8.
Figure 4.1 Illustration of TLA binarization of Fourier transform.

Figure 4.2 (a) BPOF made from the binarized Fourier transform of a truck at
0 degree rotation, (b) the reduced 10-60 bandpass TPAF made from (a).

Figure 4.3 Truck target binarized using an SDM of 3.6.
5. INPUT SCENE SAMPLING

5.1. Sampling requirements and techniques

To effectively train a neural network, suitable input and output
variables must be identified. The goal in this research is to simulate the
relationship between the input plane and the optimum filter for the optical
correlator. This relationship may be established by determining the orientation
of a target, creating an appropriate filter, and downloading the filter to the
filter plane SLM. This process may be too slow to be used successfully in the
HAC system. However, by choosing representative examples of input scenes it
is possible to train an artificial neural network to learn the input/output
behavior, and this learning can be used in real time in the HAC system.

Neural network inputs for filter synthesis may be selected in many
ways. All neural network inputs for this research were obtained from the input
gray-level image. One sampling method, illustrated in Figure 5.1, used the
gray-level pixel values in a 5 x 5 grid centered on the target. All except the
center pixel value changed as the target was rotated, and on average the
fraction of off-target pixel values was 10-15 percent. The 25 intensity values
were from the box regions indicated on the images and in the illustration.
For all backgrounds there was a discernible difference in each of the 25
intensity values for each two degrees of rotation. In both cases illustrated,
the upper right corner of the sampling grid extended into the background, and
pixel values on the background were constant. Using part of the background
for neural network inputs enabled the incorporation of noise. The 5 x 5 grid
sampling technique allowed for successful filter synthesis using
back-propagation neural networks if gray-level images were used in the input
plane and TPAFs were used in the filter plane.
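
The 5 x 5 sampling step can be sketched as follows. This is only an illustration: the function name, the nested-list image layout, and the choice of 25 adjacent pixels (the report's "box regions" may be spaced differently) are assumptions, and no image-boundary handling is included.

```python
def sample_center_grid(image, center_row, center_col, size=5):
    """Return the size*size gray levels around the target center, row by row.

    Assumes `image` is a list of rows and that the grid fits inside it.
    """
    half = size // 2
    values = []
    for r in range(center_row - half, center_row + half + 1):
        for c in range(center_col - half, center_col + half + 1):
            values.append(image[r][c])
    return values


# Hypothetical 128 x 128 scene: uniform gray-127 background, one bright pixel
# standing in for the target center.
scene = [[127] * 128 for _ in range(128)]
scene[64][64] = 200
inputs = sample_center_grid(scene, 64, 64)
print(len(inputs), inputs[12])  # prints: 25 200
```

The 25 returned values would form one neural network input vector; index 12 is the center pixel, the only one that does not change as the target rotates.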

Unfortunately, gray-level input plane images cannot be implemented
using the SLMs described for the HAC system. A simple 5 x 5 pixel grid does
not adequately represent input/output behavior for neural network training
for filter synthesis with the binary input images that must be used in the HAC
system. Thus, for binary correlator input plane images it is necessary to use
a different input plane sampling technique.

5.2.  Adopted sampling technique for binary correlator input images

Two  techniques  were  developed  with  the  goal  of  training  a  neural 
network  to  perform  well  independent  of  background. 

One technique used a 9 x 9 sampling grid centered on the target and an
algorithm that reduced the 81 values in the grid to 25 values. This
transformation of 81 to 25 values is illustrated in Figure 5.2. The
transformation algorithm, which was designed to yield gradual but significant
pixel value changes as the target was rotated, proceeds as follows. For each
row of the 9 x 9 array the average deviation from the average row value is
computed. The same computation is performed for each column, each diagonal
direction, and the center 5 x 5 grid of values. The total number of values
generated is thus 25 (9 rows + 9 columns + 6 diagonal directions + center
5 x 5 grid = 25). Examples of actual backgrounds with the sampling grid
superimposed are illustrated in Figure 5.3. It was found that this sampling
technique allows the network to perform relatively independently of angle, but
the technique degrades when tested with the target superimposed on different
backgrounds not included in the training set.
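
A minimal sketch of the 81-to-25 reduction is given below. Two points are assumptions rather than details taken from the report: "average deviation" is read as the mean absolute deviation from the mean, and the six diagonal directions are taken to be lines through the grid center with slopes of plus or minus 1, 2, and 1/2 (the exact pixels are defined only in Figure 5.2). The function and constant names are hypothetical.

```python
def mean_abs_dev(values):
    """Mean absolute deviation of the values from their own mean."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


# ASSUMED (row step, column step) pairs for six diagonal directions
# through the center pixel; the report does not list them.
DIAGONAL_STEPS = [(1, 1), (1, -1), (1, 2), (1, -2), (2, 1), (2, -1)]


def reduce_9x9(grid):
    """Reduce a 9 x 9 grid (list of 9 rows) to 25 values."""
    n, c = 9, 4
    features = [mean_abs_dev(row) for row in grid]                 # 9 rows
    features += [mean_abs_dev([grid[r][k] for r in range(n)])      # 9 columns
                 for k in range(n)]
    for dr, dc in DIAGONAL_STEPS:                                  # 6 diagonals
        line = []
        for t in range(-4, 5):
            r, k = c + t * dr, c + t * dc
            if 0 <= r < n and 0 <= k < n:
                line.append(grid[r][k])
        features.append(mean_abs_dev(line))
    center = [grid[r][k] for r in range(2, 7) for k in range(2, 7)]
    features.append(mean_abs_dev(center))                          # center 5 x 5
    return features
```

On a uniform grid every one of the 25 values is zero, which illustrates why 18 of the inputs (the row and column values) respond to only one row or column of the scene.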

A second sampling technique was developed to address the degradation
problem. This technique sections the 9 x 9 grid into 36 wedges and computes
the total intensity value in each wedge. The horizontal and vertical wedges
depend only on one row or column and are not used for inputs to the neural
network, thus avoiding the problem of the previous sampling technique where
all inputs depended on only one row or one column. The 32-value input plane
sampling technique is illustrated in Figure 5.4. It was found to be successful
for neural network interpolation independent of target background or noise.
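
The wedge sampling can be sketched as below. The report does not state how pixels are assigned to wedges, so this sketch assumes 10-degree wedges centered on multiples of 10 degrees, assigns each pixel by the angle of its center about the grid center, and skips the center pixel. Under that assumption the four wedges centered on 0, 90, 180, and 270 degrees contain only center-row or center-column pixels, which matches the description of the discarded wedges. The function name is hypothetical.

```python
import math


def wedge_inputs(grid, n_wedges=36):
    """Sum gray levels in angular wedges and drop the four axis wedges."""
    n = len(grid)
    c = n // 2
    width = 360.0 / n_wedges
    sums = [0.0] * n_wedges
    for r in range(n):
        for k in range(n):
            dx, dy = k - c, c - r            # image rows grow downward
            if dx == 0 and dy == 0:
                continue                     # center pixel has no direction
            # Rounding keeps pixels on wedge boundaries (e.g. 45 degrees)
            # from flipping sides through floating-point noise.
            angle = round(math.degrees(math.atan2(dy, dx)), 6) % 360.0
            index = int(((angle + width / 2.0) % 360.0) // width)
            sums[index] += grid[r][k]
    axis_wedges = {0, n_wedges // 4, n_wedges // 2, 3 * n_wedges // 4}
    return [s for i, s in enumerate(sums) if i not in axis_wedges]
```

For a 9 x 9 grid this returns 32 values, and changing any pixel in the center row or center column (other than through the wedges kept) leaves the result unchanged.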
Figure 5.1  A 5 x 5 pixel grid superimposed on (a) truck on city70 background
at a rotation of 0 degrees, (b) truck on city70 background at a
rotation of 20 degrees, (c) truck on trbk70 background at a
rotation of 0 degrees, and (d) truck on trbk70 background at a
rotation of 20 degrees.

Figure 5.2  9 x 9 pixel grid used to obtain 25 input values for neural network.
Legend: pixel used for negative slope diagonals; pixel used for
both positive and negative slope diagonals.

Figure 5.3  A 9 x 9 pixel grid superimposed on (a) truck on city70 background
at a rotation of 0 degrees, (b) truck on city70 background at a
rotation of 20 degrees, (c) truck on trbk70 background at a
rotation of 0 degrees, and (d) truck on trbk70 background at a
rotation of 20 degrees.
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6.  NEURAL  NETWORK  DESIGNS 


6.1  Neural  network  architecture  and  training 

Neural computing attempts to use architectures and processing
techniques similar to those found in biological neural systems. Biological brains
store and learn information using cells called neurons. Each neuron has
associated with it input dendrites and an output axon. The input dendrites
receive chemical stimuli through a synapse connection from many other
neurons by means of their respective axons. If enough total stimulus is
present, the neuron "fires" by releasing a signal along its axon. The strength
of this signal is determined by the incoming stimuli to the dendrites. The
reactions in biological brains are chemical, but they have electrical side effects
which can be measured. Learning is accomplished by adapting the strength of
the signal carried along the axon to other neurons. Memory is achieved by
storing the strengths or weights of the neuron interconnections. Modeling
neurons requires multiple variable inputs (to simulate the dendrites), a
transfer function (to simulate the neural firing threshold), and multiple
variable outputs (to simulate axon strengths connected to other neurons). By
interconnecting such model neurons it is possible to simulate processes similar
to those accomplished by biological brains. These processes are extremely
parallel and are unlike the typical von Neumann processes which are the basis
for modern digital computing. Digital computing is often slow and
inappropriate for solving problems such as hitting a moving ball with a bat or
backing up an 18-wheel semi-trailer truck to a loading dock.

In general, neural networks are used to establish input/output
relationships that are not easily described by rules. If rules can be identified
for mapping the input to the output, then digital computing should be
applicable. Neural networks generate their own rules by learning from
examples. The neural networks employed here use supervised learning and
require adaptation of the neuron interconnection weights.

6.2.  Back-propagation  neural  network 

The back-propagation neural network illustrated in Figure 6.1 uses a
learning algorithm based on reducing the error between the actual output of
the network and the desired output. Error reduction is accomplished by
modifying the neuron interconnection weights. A back-propagation neural
network has at least three layers: an input layer, an output layer, and a
"hidden" layer typically not connected to any inputs or outputs. It is
feed-forward, which means that the outputs from any layer are never fed back to
previous layers. The back-propagation neural network uses delta-rule learning,
which is a gradient descent procedure that adjusts the interconnection weights
by minimizing the sum of the squared differences between the actual neural
network output and the desired output. For an output layer of k neurons this
function is

    $$E = \sum_k (d_k - y_k)^2$$

where $d_k$ is the desired output of the k th neuron
      $y_k$ is the actual output of the k th neuron.

The back-propagation neural network uses supervised learning, which means
that $d_k$ is known. The value of $y_k$ is

    $$y_k = f(v_k)$$

where $f(v)$ is typically the sigmoidal transfer function

    $$f(v) = \frac{1}{1 + e^{-v}}$$

The first derivative of this function is

    $$f'(v) = f(v)\,[1 - f(v)]$$

A typical argument of this function is

    $$v_k = \sum_j w_{jk}\, y_j$$

where $w_{jk}$ is the interconnection weight of the j th hidden neuron
      to the k th output neuron
      $y_j$ is the value of the j th hidden layer neuron.

$y_j$ is given by

    $$y_j = f(v_j)$$

where $v_j$ is the sum of the weighted outputs of the input layer or

    $$v_j = \sum_i w_{ij}\, y_i$$

where $w_{ij}$ is the interconnection weight of the i th input neuron to
      the j th hidden neuron,
      $y_i$ is the i th input.
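
The forward pass defined by these relations can be sketched in a few lines. This is an illustrative sketch only: the function names and the nested-list weight layout are assumptions, bias terms are omitted because the relations above do not include them, and the sigmoid is applied at the output layer as in this section.

```python
import math


def sigmoid(v):
    """Sigmoidal transfer function f(v) = 1 / (1 + exp(-v))."""
    return 1.0 / (1.0 + math.exp(-v))


def sigmoid_prime(v):
    """First derivative f'(v) = f(v) * (1 - f(v))."""
    f = sigmoid(v)
    return f * (1.0 - f)


def forward(inputs, w_ih, w_ho):
    """Forward pass; w_ih[i][j] is input i -> hidden j, w_ho[j][k] is hidden j -> output k."""
    hidden = [sigmoid(sum(inputs[i] * w_ih[i][j] for i in range(len(inputs))))
              for j in range(len(w_ih[0]))]
    outputs = [sigmoid(sum(hidden[j] * w_ho[j][k] for j in range(len(hidden))))
               for k in range(len(w_ho[0]))]
    return hidden, outputs


def squared_error(desired, actual):
    """Sum of squared differences between desired and actual outputs."""
    return sum((d - y) ** 2 for d, y in zip(desired, actual))
```

With all weights set to zero every neuron outputs f(0) = 0.5, which gives a quick sanity check before training starts.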

Initially the interconnection weights are set to small random values. To
minimize the sum of squared differences between the actual neural network
outputs and the desired outputs, a delta weight is determined. For
hidden-to-output layer weights the change in weights as given in Figure 6.2 is

    $$\Delta w_{jk} = \eta\, \delta_k\, y_j$$

where $\eta$ is a gain constant which controls the strength of the
      weight change
      $\delta_k$ is defined in Figure 6.2.

For input-to-hidden layer weights the change in weights as given in Figure 6.2
is

    $$\Delta w_{ij} = \eta\, \delta_j\, y_i$$


Figure 6.2 illustrates the gradient descent technique used to adjust the
interconnection weights. It is advantageous to increase the learning rate by
adding a momentum term to the delta weights [Rumelhart 1986]. For
hidden-to-output layer weights the new delta weight with momentum is

    $$\Delta w_{jk}(n+1) = \eta\, \delta_k\, y_j + \alpha\, \Delta w_{jk}(n)$$

where n is the iteration number
      $\alpha$ is the momentum constant, which controls the strength
      of the weight change in terms of past weight changes.

For input-to-hidden layer weights the new delta weight with momentum is

    $$\Delta w_{ij}(n+1) = \eta\, \delta_j\, y_i + \alpha\, \Delta w_{ij}(n)$$

Presenting enough representative examples of the known input/output
behavior establishes the patterns used to interpolate or approximate the
outputs for inputs not used in training [Lippmann 1987]. These patterns are
stored in the interconnection weights.
6.3  Stretch and hammer neural network


The stretch and hammer neural network illustrated in Figure 6.3 is
designed to interpolate the training data by fitting a continuous hyper-function
of dimensionality equal to the number of inputs. This network is used for filter
synthesis as follows. First, the input space is "stretched" into the principal
component space. For the 138 input vectors of 32 values each used to train the
network, a 138 x 32 training data matrix is formed and pre-multiplied by its
transpose:

    $$X^{T} X, \qquad X = [x_i^j] \quad (j = 1, \ldots, 138;\ i = 1, \ldots, 32)$$

where $x_i^j$ is the i th component of the j th training example with each
component scaled so that its mean (for all training examples) is zero. The
eigenvalues $\lambda_1, \ldots, \lambda_{32}$ of this matrix are obtained as the
solutions of

    $$\det(X^{T} X - \lambda I) = 0$$

Solving the simultaneous equations

    $$(X^{T} X - \lambda I)\, u = 0$$

for each eigenvalue yields each orthogonal 32-dimensional principal
components axis (with components $u_i$), where each axis is scaled in units of its
eigenvalue. Next, a least squares hyper-plane is fit to these points using

    $$\sum_{j=1}^{138} \Big( z_j - b_0 - \sum_{i=1}^{32} b_i u_i^j \Big) = 0, \qquad
      \sum_{j=1}^{138} \Big( z_j - b_0 - \sum_{i=1}^{32} b_i u_i^j \Big) u_k^j = 0 \quad (k = 1, \ldots, 32)$$

where $u_i^j$ represent the inputs in the zero-mean principal component space,
$z_j$ are the outputs for these inputs, $b_0$ is the intercept and the remaining
$b_i$'s are the slopes of the 32-dimensional hyper-plane. The $b_i$ are determined
by solving
the above simultaneous equations. The least squares hyper-plane will not in
general match any of the outputs. To produce a perfect match at the outputs,
a radial Gaussian "hammer" is used to deform the hyper-plane until it
intersects each output value. This deformation is accomplished using the
following matrix algebra:

    $$\begin{bmatrix}
      f_{1,1} & f_{1,2} & \cdots & f_{1,138} \\
      f_{2,1} & f_{2,2} & \cdots & f_{2,138} \\
      \vdots & \vdots & \ddots & \vdots \\
      f_{138,1} & f_{138,2} & \cdots & f_{138,138}
      \end{bmatrix}
      \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_{138} \end{bmatrix}
      =
      \begin{bmatrix} z_1 - z'_1 \\ z_2 - z'_2 \\ \vdots \\ z_{138} - z'_{138} \end{bmatrix}$$

where $f_{ij} = \exp\{[-1/(2 s_j^2)][(u_1^i - u_1^j)^2 + (u_2^i - u_2^j)^2 + \ldots + (u_{32}^i - u_{32}^j)^2]\}$
for i, j = 1, 2, ..., 138,
$z'_i$ are the outputs given by the least squares hyper-plane, $z_i$ are the known
training outputs, $s_j$ are the standard deviations of the radial Gaussians, and
$c_j$ are the radial Gaussian hammer weights. The diagonal elements of this
matrix are unity, and the standard deviations of the Gaussian hammers are
selected such that the off-diagonal elements of each column add to a value
slightly less than unity. The $c_j$ may be obtained using Gaussian elimination,
singular value decomposition, or any of several other techniques for solving
simultaneous linear equations. The stretch and hammer neural network is
designed to exactly match every training output and to smoothly generalize for
all other outputs.
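
The hammer step can be illustrated with a one-input sketch, for which the principal-component "stretch" reduces to a change of origin and is therefore omitted. Everything named here is hypothetical: the function names, the target column sum of 0.9 (standing in for "slightly less than unity"), the bisection used to pick each standard deviation, and the plain Gaussian elimination solver are assumptions made for illustration, not the report's implementation.

```python
import math


def fit_line(us, zs):
    """Least squares intercept b0 and slope b1 for one input."""
    n = len(us)
    mu, mz = sum(us) / n, sum(zs) / n
    b1 = (sum((u - mu) * (z - mz) for u, z in zip(us, zs))
          / sum((u - mu) ** 2 for u in us))
    return mz - b1 * mu, b1


def gauss(u, center, s):
    return math.exp(-((u - center) ** 2) / (2.0 * s * s))


def column_sigma(us, j, target=0.9):
    """Bisect for s_j so column j's off-diagonal sum is just below `target`."""
    lo, hi = 1e-6, 1e6
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        total = sum(gauss(u, us[j], mid) for i, u in enumerate(us) if i != j)
        if total < target:
            lo = mid
        else:
            hi = mid
    return lo


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def train(us, zs):
    b0, b1 = fit_line(us, zs)
    sig = [column_sigma(us, j) for j in range(len(us))]
    f = [[gauss(ui, uj, sj) for uj, sj in zip(us, sig)] for ui in us]
    residual = [z - (b0 + b1 * u) for u, z in zip(us, zs)]
    return b0, b1, sig, solve(f, residual)


def predict(model, us, u):
    b0, b1, sig, c = model
    return b0 + b1 * u + sum(cj * gauss(u, uj, sj)
                             for cj, uj, sj in zip(c, us, sig))
```

Keeping each column's off-diagonal sum below its unit diagonal makes the matrix strictly diagonally dominant by columns, so the linear system always has a unique solution and the fitted surface passes through every training output.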
6.4  Comparison of stretch and hammer and back-propagation

There are important differences in the architectures of the stretch and
hammer and the back-propagation neural networks. The stretch and hammer
neural network has a number of adjustable parameters that depends on the
number of training examples. The adjustable parameters for a
back-propagation neural network depend on the number of neurons employed. A
back-propagation neural network with k input neurons, m hidden layer
neurons, n output neurons, and no interlayer interconnections has km + mn
+ m + n adjustable parameters. A stretch and hammer neural network has
n(k + 1 + p) adjustable parameters, where p is the number of examples used
for training. A 31 input, 200 neuron hidden layer, 600 output back-propagation
neural network with 50% of the interconnections randomly removed, as was
used for this research, has

    0.5 * (31 * 200 + 200 * 600 + 200 + 600) = 63,500

adjustable parameters. This value is independent of the number of training
examples. Using 138 training examples, as was the case for this research, a
stretch and hammer neural network has

    600 * (31 + 1 + 138) = 102,000

adjustable parameters.
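
The two parameter-count formulas are simple enough to check directly. The sketch below uses hypothetical function names and, following the expression in the text, applies the 50% pruning factor to the whole back-propagation count rather than to the weights alone.

```python
def backprop_parameters(k, m, n, kept_fraction=1.0):
    """km + mn + m + n, scaled by the fraction of the network that is kept."""
    return kept_fraction * (k * m + m * n + m + n)


def stretch_hammer_parameters(k, n, p):
    """n * (k + 1 + p) for k inputs, n outputs, and p training examples."""
    return n * (k + 1 + p)


# Network sizes quoted in the text.
bp_count = backprop_parameters(31, 200, 600, kept_fraction=0.5)
sh_count = stretch_hammer_parameters(31, 600, 138)
print(bp_count, sh_count)
```

The stretch and hammer count grows linearly with the number of training examples p, whereas the back-propagation count is fixed once the layer sizes are chosen.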

To compare performance of the two networks, a simple exercise involving
rotation estimation rather than filter synthesis was carried out for which both
networks had the same number of adjustable parameters. A training set of 5
examples with one input and one output was used, and the back-propagation
neural network had two hidden neurons. The number of adjustable parameters
for both neural networks was therefore

    1 * (1 + 1 + 5) = (1 * 2 + 2 * 1 + 2 + 1) = 7.

The input was the average gray level value for one quadrant of a 9 x 9 pixel
grid, and the output was the in-plane angular rotation of the target. The
examples used for training were at rotation angles of 0, 8, 16, 24, and 30
degrees, and testing results were obtained at angles of 0, 1, ..., 30 degrees.

The back-propagation neural network results varied as the learning
process progressed. This variation is illustrated in Figure 6.4, which shows the
best performance for an RMS training error of 3.5%. After training, the stretch
and hammer neural network yielded approximately 0.5 degree average error for
testing angles. At 3.5% RMS training error, the back-propagation neural
network testing results approached those of the stretch and hammer neural
network as illustrated in Figure 6.5. When the back-propagation neural
network was allowed to continue training to an RMS training error of
approximately 0%, the results show a larger error for testing, although at the
training examples the error is near zero as illustrated in Figure 6.6. Thus the
back-propagation neural network does well at retrieving the training examples
but poorly at interpolating the testing examples. The stretch and hammer
neural network retrieves the training examples exactly by definition, and in
the above exercise interpolates better than the back-propagation neural
network for the testing examples.




Figure 6.1  (a) A back-propagation neural network with 31 inputs, 200
hidden-layer neurons, and 600 output neurons. (b) A simple
processing element where $w_{jk}$ = weighted connections,
k = neuron number in current layer, j = neuron number of
previous layer.
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Figure  6.2  Gradient  descent  technique  used  to  determine  changes  in 
interconnection  weights. 
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Figure  6.3  Stretch  and  hammer  neural  network  with  m  examples  and  n 
inputs. 
Figure 6.4  Comparison of average error (deg) for target rotation estimation.
The back-propagation neural network was trained to various RMS
training error values. The stretch and hammer neural network
average error for testing was approximately 0.5 degrees. Plot axes:
average testing error (deg) versus RMS training error when tested (%).
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Figure  6.5  Comparison  of  error  (deg)  for  target  rotation  estimation  for  a 
back-propagation  RMS  training  error  of  3.5%  with  stretch  and 
hammer  results. 
Figure 6.6  Comparison of error (deg) for target rotation estimation for a
back-propagation RMS error of 0% with stretch and hammer
results.
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7.  FILTER  SYNTHESIS  RESULTS 


The results of four neural network filter synthesis cases are discussed
in this section. The neural network outputs for each case are the 600 coded
filter values described in Section 4. The first case uses a back-propagation
neural network and 25 intensity values from a 5 x 5 target-centered pixel grid
as inputs. The second case uses a back-propagation neural network and 25
computed values from a 9 x 9 target-centered pixel grid as inputs as described
in Section 5. The third case uses a back-propagation neural network and 32
computed values from a 9 x 9 target-centered pixel grid as inputs as described
in Section 5. The fourth case uses a stretch and hammer neural network and
the same inputs as the third case. The networks were tested by varying one or
both of two input scene parameters: the angle of target rotation and the type
of cluttered background. For the gray-level correlator input scenes the angles
were offset by 2 degrees, and for the binarized correlator input scenes the
angles were offset by 1 degree. Varying the angle of rotation permitted
investigation of the performance of the network in interpolating between
training angles. Varying the type of background permitted investigation of the
robustness of the network. The goal was to train the network to ignore
background effects.
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7.1.  Correlation  peak  metrics 

Two correlation peak metrics were used to evaluate neural network
performance: a peak-to-sidelobe ratio and a peak-to-clutter ratio. A ratio of less
than 3 dB is inadequate for use in the HAC system. The correlation plane is
expected to exhibit a peak near the target center. For all cases considered here
the target was superimposed on the center of a background scene. For
gray-level correlator input scenes a peak finding algorithm was used to locate
peaks in three regions of the correlation plane: region 1 was a target-centered
5 x 5 pixel grid, region 2 was a target-centered 15 x 15 pixel grid excluding
region 1, and region 3 was the remaining area of the 128 by 128 pixel
correlation plane. For binary correlator input scenes a peak finding algorithm
was used to locate peaks in two regions of the correlation plane: region 1 was a
target-centered 5 x 5 pixel grid, and region 2' was the remaining area of the
correlation plane. A peak-to-sidelobe ratio measurement was made for
gray-level correlator input scenes by comparing the highest peak in region 1
with the highest peak in region 2. A peak-to-clutter ratio measurement was made
by comparing the highest peak in region 1 with the highest peak in region 3
for gray-level correlator input scenes and with the highest peak in region 2' for
binary correlator input scenes. Peak-to-sidelobe ratios were smaller than
peak-to-clutter ratios for the gray level input scenes.
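
The two metrics for the gray-level case can be sketched as follows. The report does not say how the dB values were formed, so this sketch assumes the correlation plane holds intensity values and uses 10 log10 of the peak ratio; amplitude values would call for 20 log10 instead. The function names and the small test plane are hypothetical.

```python
import math


def region_peak(plane, center, inner_half, outer_half):
    """Highest value whose Chebyshev distance d from center satisfies inner_half < d <= outer_half."""
    cr, cc = center
    best = None
    for r, row in enumerate(plane):
        for c, value in enumerate(row):
            d = max(abs(r - cr), abs(c - cc))
            if inner_half < d <= outer_half and (best is None or value > best):
                best = value
    return best


def ratio_db(peak, reference):
    # ASSUMPTION: intensity values, hence the factor of 10.
    return 10.0 * math.log10(peak / reference)


def peak_metrics(plane, center):
    """Return (peak-to-sidelobe, peak-to-clutter) in dB for a gray-level scene."""
    size = len(plane)
    p1 = region_peak(plane, center, -1, 2)      # region 1: 5 x 5 grid
    p2 = region_peak(plane, center, 2, 7)       # region 2: 15 x 15 minus region 1
    p3 = region_peak(plane, center, 7, size)    # region 3: everything else
    return ratio_db(p1, p2), ratio_db(p1, p3)


# Hypothetical 32 x 32 correlation plane with a peak and one sidelobe.
plane = [[1.0] * 32 for _ in range(32)]
plane[16][16] = 8.0
plane[16][20] = 2.0
psr, pcr = peak_metrics(plane, (16, 16))
usable = psr >= 3.0   # 3 dB acceptance threshold from the text
```

For the binary-input case, region 2' would be obtained with `region_peak(plane, center, 2, size)`, since it covers everything outside the 5 x 5 grid.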
7.2.  Back-propagation neural network results
(5 x 5 grid, peak-to-sidelobe)

Using gray level correlator inputs and a TPAF in the filter plane, a
back-propagation neural network with 25 inputs, 200 hidden neurons, and 600
outputs successfully synthesized filters. The hidden neurons had sigmoidal
transfer functions, and the output neurons had summation transfer functions.
The network was trained on a set of 180 input scenes of 128 by 128 pixels, 90
of which corresponded to the truck target rotated at angles of 0, 4, ..., 356
degrees on a uniform gray 127 intensity-level background. The remaining 90
scenes corresponded to the truck at angles of 0, 4, ..., 356 degrees on a city
background. A single filter was produced for each rotation angle as described
in Section 4. The filters were made from the binarized Fourier transform of the
truck on a gray-level background only. Thus, there were only 90 output filter
examples, one for each training rotation angle. Two backgrounds were used so
that the neural network could be trained to ignore pixel values that were not
on the target (i.e., that were background pixels).

Figures 7.1 through 7.5 show graphs of the peak-to-sidelobe ratio (in dB)
versus target rotation (in degrees) for a variety of cases. These results are
summarized in Table 7.1. The inputs were the 25 intensity values from a
target-centered 5 x 5 pixel grid. The outputs were the 600 coded values used
to generate a 10-60 bandpass 3 x 3 superpixel TPAF. Since the peak-to-clutter
ratio was always higher than the peak-to-sidelobe ratio, network performance
assessment was based on the peak-to-sidelobe ratio.

The "zero degree filter" curve shows that more than one fixed filter is
needed to accommodate target rotation. The filter for this curve was symmetric
and was constructed from the binarized Fourier transform of the truck on a
127 gray-level background at a rotation angle of zero degrees. Performance for
this filter depends on the background. It is apparent from Figures 7.1 through
7.5 that the peak-to-sidelobe ratio decreases rapidly for only a few degrees of
rotation: target rotation by more than approximately 5 degrees decreases the
correlation peak to the noise level.

The "best expected filter from network" curve shows the upper limit for
the peak-to-sidelobe ratio because the trained-for filter is used to correlate with
the input scene. Figures 7.1 through 7.5 show that the peak-to-sidelobe ratio
varies as the target is rotated, but a relatively high value is typically
maintained.

The  "filter  synthesized  by  network"  curve  shows  performance  when  the 
filter  synthesized  by  the  network  is  used  to  correlate  with  the  input  scene. 
Performance  depends  on  variations  in  scene  parameters  used  to  test  the 
network  and  is  summarized  in  Table  7.1. 

In general, when the neural network was tested for backgrounds used
in training, the network performed very well and almost matched the best
expected performance. When the angle of target rotation was offset by 2
degrees the network performance degraded slightly. However, for the city
background (which was used in training) the network performance was still
close to the best expected performance. For this case only one input parameter
was varied, namely the rotation angle. As expected, when the background was
also varied the network performance degradation increased, but performance
was adequate for pattern recognition by optical correlation in all but 12 out of
270 testing cases. In these 12 cases the peak-to-sidelobe ratio was below the
3 dB line, which is not acceptable for use in the correlator.

Background   Angles of Rotation   Comments

Training

Gray         0, 4, ..., 360       The performance of the network matches
                                  the best expected performance almost
                                  exactly.

City         0, 4, ..., 360       The performance of the network matches
                                  the best expected performance almost
                                  exactly.

Testing

City         2, 6, ..., 358       The general performance of the network
                                  follows the best expected performance.
                                  Performance drops below the 3 dB line
                                  for rotation angles of 66, 242, 258, and
                                  322 degrees.

Bushy        2, 6, ..., 358       The performance of the network is
                                  slightly degraded. Performance drops
                                  below the 3 dB line for rotation angles
                                  of 130 and 310 degrees.

Newbk80      2, 6, ..., 358       The newbk80 background camouflages
                                  the truck and a reduction in the
                                  peak-to-sidelobe ratio is expected. The
                                  overall performance, however, indicates
                                  acceptable peak-to-sidelobe ratios for
                                  target recognition. Performance drops
                                  below the 3 dB line for rotation angles
                                  of 78, 96, 118, 122, 130, and 210
                                  degrees.


Table  7.1  Shows  the  performance  of  a  filter  synthesized  by  a  back- 
propagation  neural  network  using  a  5  x  5  sampling  grid  and  the 
peak-to-sidelobe  metric  for  a  truck  rotated  on  a  variety  of 
backgrounds. 
Figure 7.1  Filter synthesis results using a back-propagation neural network
for the truck on the gray background at truck rotation angles of
0, 4, ..., 360 degrees. The network was trained on both gray and
city backgrounds at truck rotation angles of 0, 4, ..., 360 degrees
using as inputs 25 intensity values from a target-centered 5 x 5
pixel grid.

Figure 7.2  Filter synthesis results using a back-propagation neural network
for the truck on the city background at truck rotation angles of
0, 4, ..., 360 degrees. The network was trained on both gray and
city backgrounds at truck rotation angles of 0, 4, ..., 360 degrees
using as inputs 25 intensity values from a target-centered 5 x 5
pixel grid.

Figure 7.3  Filter synthesis results using a back-propagation neural network
for the truck on the city background at truck rotation angles of
2, 6, ..., 358 degrees. The network was trained on both gray and
city backgrounds at truck rotation angles of 0, 4, ..., 360 degrees
using as inputs 25 intensity values from a target-centered 5 x 5
pixel grid.

Figure 7.4  Filter synthesis results using a back-propagation neural network
for the truck on the bushy background at truck rotation angles of
2, 6, ..., 358 degrees. The network was trained on both gray and
city backgrounds at truck rotation angles of 0, 4, ..., 360 degrees
using as inputs 25 intensity values from a target-centered 5 x 5
pixel grid.

Figure 7.5  Filter synthesis results using a back-propagation neural network
for the truck on the newbk80 background at truck rotation angles
of 2, 6, ..., 358 degrees. The network was trained on both gray
and city backgrounds at truck rotation angles of 0, 4, ..., 360
degrees using as inputs 25 intensity values from a target-centered
5 x 5 pixel grid.
7.3.  Back-propagation neural network results
(9 x 9 grid, peak-to-clutter)

Using binarized correlator inputs and a TPAF in the filter plane, a
back-propagation neural network with 25 inputs (computed from the gray-level
input image), 200 hidden neurons, and 600 outputs successfully synthesized
filters. The neural network was trained on a set of 138 input scenes of 128 by
128 pixels, 46 of which corresponded to the truck rotated at angles of 0, 2, ...,
90 degrees on a uniform gray 66 intensity-level background, 46 of which
corresponded to the truck rotated at angles of 0, 2, ..., 90 degrees on a uniform
gray 142 intensity-level background, and the remaining 46 of which
corresponded to the truck rotated at angles of 0, 2, ..., 90 degrees on a city
background. A single filter was produced for each rotation angle as described
in Section 4. The filters were made from the binarized Fourier transform of the
truck on a blank background only. Thus, there were only 46 output filter
examples, one for each training angle. Three backgrounds were used so that
the neural network could be trained to ignore pixel values that were not on the
target.

Figures 7.6 through 7.13 show graphs of the peak-to-clutter ratio (in dB)
versus target rotation (in degrees) for a variety of cases. These results are
summarized in Table 7.2. The inputs were 25 values from a target-centered
9 x 9 pixel grid as described in Section 5. The outputs were the 600 coded
values used to generate a 10-60 bandpass 3 x 3 superpixel TPAF. The "zero
degree filter", "best expected filter", and "filter synthesized by network" curves
have the same meaning as in Section 7.2. Performance depends on variations in
scene parameters used to test the network and is summarized in Table 7.2.

When tested on the city70 background at angles of 0, 2, ..., 90 degrees
(which were used in training), the neural network performance almost matched
the best expected performance. When different backgrounds were used at the
training rotation angles, the network performance degraded slightly. As
expected, the degradation increased at the testing rotation angles (1, 3, ...,
89 degrees). However, network performance was adequate for pattern recognition
by optical correlation (above the 3 dB line) in all but 7 out of 318 testing
cases.
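
The count of 318 testing cases and 7 failures can be reproduced from the
entries of Table 7.2. The sketch below is a bookkeeping aid only: the
dictionary keys are illustrative labels, and the failing angles are copied
from the table comments.

```python
TRAIN_ANGLES = range(0, 91, 2)   # 46 angles used in training
TEST_ANGLES = range(1, 90, 2)    # 45 angles not used in training

# Angles (degrees) reported below the 3 dB line in Table 7.2, keyed by
# (background, angle set). City70 at the training angles is omitted because
# it is the training case, not a testing case.
below_3db = {
    ("city70", "test"): [25, 37, 83],
    ("trbk70", "train"): [],
    ("trbk70", "test"): [],
    ("newbk50", "train"): [],
    ("newbk50", "test"): [25],
    ("bushy", "train"): [66],
    ("bushy", "test"): [65, 81],
}


def tally(cases):
    """Return (number of testing cases, number below the 3 dB line)."""
    total = sum(
        len(TRAIN_ANGLES) if kind == "train" else len(TEST_ANGLES)
        for _, kind in cases
    )
    failures = sum(len(angles) for angles in cases.values())
    return total, failures


total_cases, failed_cases = tally(below_3db)
print(total_cases, failed_cases)
```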
            Background   Angles of Rotation   Comments

Training    City70       0, 2, ..., 90        The performance of the network
                                              matches the best expected
                                              performance almost exactly.

Testing     City70       1, 3, ..., 89        The general performance of the
                                              network follows the best expected
                                              performance. Performance drops
                                              below the 3 dB line for target
                                              rotation angles of 25, 37, and 83
                                              degrees.

            Trbk70       0, 2, ..., 90        The general performance of the
                                              network closely follows the best
                                              expected performance.

            Trbk70       1, 3, ..., 89        The performance of the network is
                                              adequate for pattern recognition.

            Newbk50      0, 2, ..., 90        The general performance of the
                                              network closely follows the best
                                              expected performance.

            Newbk50      1, 3, ..., 89        The general performance of the
                                              network follows the best expected
                                              performance. Performance drops
                                              below the 3 dB line for a rotation
                                              angle of 25 degrees.

            Bushy        0, 2, ..., 90        The general performance of the
                                              network closely follows the best
                                              expected performance. Performance
                                              drops below the 3 dB line for a
                                              rotation angle of 66 degrees.

            Bushy        1, 3, ..., 89        The general performance of the
                                              network closely follows the best
                                              expected performance. Performance
                                              drops below the 3 dB line for
                                              rotation angles of 65 and 81
                                              degrees.

Table 7.2  Performance of a filter synthesized by a back-propagation neural
           network using a 9 x 9 sampling grid and the peak-to-clutter
           metric for a truck rotated on a variety of backgrounds.
Figure 7.6  Filter synthesis results using a back-propagation neural network
for the truck on the city70 background at truck rotation angles of
0, 2, ..., 90 degrees. The network was trained on 66 gray level,
142 gray level, and city70 backgrounds at truck rotation angles
of 0, 2, ..., 90 degrees using as inputs 25 computed values from
a target-centered 9x9 pixel grid.

Figure 7.7  Filter synthesis results using a back-propagation neural network
for the truck on the city70 background at truck rotation angles of
1, 3, ..., 89 degrees. The network was trained on 66 gray level,
142 gray level, and city70 backgrounds at truck rotation angles
of 0, 2, ..., 90 degrees using as inputs 25 computed values from
a target-centered 9x9 pixel grid.

Figure 7.8  Filter synthesis results using a back-propagation neural network
for the truck on the trbk70 background at truck rotation angles
of 0, 2, ..., 90 degrees. The network was trained on 66 gray level,
142 gray level, and city70 backgrounds at truck rotation angles
of 0, 2, ..., 90 degrees using as inputs 25 computed values from a
target-centered 9x9 pixel grid.

Figure 7.9  Filter synthesis results using a back-propagation neural network
for the truck on the trbk70 background at truck rotation angles
of 1, 3, ..., 89 degrees. The network was trained on 66 gray level,
142 gray level, and city70 backgrounds at truck rotation angles
of 0, 2, ..., 90 degrees using as inputs 25 computed values from
a target-centered 9x9 pixel grid.


[Plot: peak-to-clutter ratio (dB) versus in-plane rotation of truck target (deg), 0 to 90 degrees; the legend includes the zero degree filter curve.]

Figure 7.10  Filter synthesis results using a back-propagation neural network
for the truck on the newbk50 background at truck rotation angles
of 0, 2, ..., 90 degrees. The network was trained on 66 gray level,
142 gray level, and city70 backgrounds at truck rotation angles
of 0, 2, ..., 90 degrees using as inputs 25 computed values from
a target-centered 9x9 pixel grid.

Figure 7.11  Filter synthesis results using a back-propagation neural network
for the truck on the newbk50 background at truck rotation angles
of 1, 3, ..., 89 degrees. The network was trained on 66 gray level,
142 gray level, and city70 backgrounds at truck rotation angles
of 0, 2, ..., 90 degrees using as inputs 25 computed values from
a target-centered 9x9 pixel grid.

Figure 7.12  Filter synthesis results using a back-propagation neural network
for the truck on the bushy background at truck rotation angles of
0, 2, ..., 90 degrees. The network was trained on 66 gray level,
142 gray level, and city70 backgrounds at truck rotation angles
of 0, 2, ..., 90 degrees using as inputs 25 computed values from
a target-centered 9x9 pixel grid.

Figure 7.13  Filter synthesis results using a back-propagation neural network
for the truck on the bushy background at truck rotation angles of
1, 3, ..., 89 degrees. The network was trained on 66 gray level,
142 gray level, and city70 backgrounds at truck rotation angles
of 0, 2, ..., 90 degrees using as inputs 25 computed values from
a target-centered 9x9 pixel grid.
7.4. Back-propagation neural network results
(32 wedges, peak-to-clutter)

Using binarized correlator inputs and a TPAF in the filter plane, a
back-propagation neural network with 32 inputs (computed from the gray-level
input image), 200 hidden neurons, and 600 outputs successfully synthesized
filters. The neural network was trained on a set of 138 input scenes of 128 by
128 pixels, as described in Section 7.3.

The back-propagation neural network discussed in Section 7.3
successfully synthesized filters, but to produce filters that perform
independently of the input scene background, a different input sampling
technique was employed that used 32 values computed from wedges on a
target-centered 9x9 pixel grid as described in Section 5. Figures 7.14 through
7.21 show graphs of the peak-to-clutter ratio (in dB) versus target rotation (in
degrees) for a variety of cases. The outputs were the 600 coded values used to
generate a 10-60 bandpass 3x3 superpixel TPAF. The "zero degree filter", "best
expected filter", and "filter synthesized by network" curves have the same
meaning as in Section 7.2. Performance depends on variations in scene
parameters used to test the network and is summarized in Table 7.3.

When tested on the city70 background at angles of 0, 2, ..., 90 degrees
(which were used in training), the neural network performance almost matched
the best expected performance. When different backgrounds were used at the
training angles, the network performance degraded, but not as much as for the
results discussed in Section 7.3. As expected, at testing rotation angles (1, 3,
..., 89 degrees) the network performance degradation increased. Compared with
the sampling technique discussed in Section 7.3, the 32-wedge sampling
technique allowed the network to interpolate the 600 coded filter values better
for different input scene backgrounds but worse for testing angles. However,
the network performance was adequate for pattern recognition by optical
correlation (above the 3 dB line) in all but 8 out of 318 testing cases.
            Background   Angles of Rotation   Comments

Training    City70       0, 2, ..., 90        The performance of the network
                                              matches the best expected
                                              performance almost exactly.

Testing     City70       1, 3, ..., 89        The general performance of the
                                              network follows the best expected
                                              performance. Performance drops
                                              below the 3 dB line for rotation
                                              angles of 37 and 83 degrees.

            Trbk70       0, 2, ..., 90        The general performance of the
                                              network follows the best expected
                                              performance almost exactly.

            Trbk70       1, 3, ..., 89        The general performance of the
                                              network follows the best expected
                                              performance almost exactly.

            Newbk50      0, 2, ..., 90        The general performance of the
                                              network follows the best expected
                                              performance. Performance drops
                                              below the 3 dB line for a rotation
                                              angle of 42 degrees.

            Newbk50      1, 3, ..., 89        The general performance of the
                                              network follows the best expected
                                              performance. Performance drops
                                              below the 3 dB line for rotation
                                              angles of 41 and 45 degrees.

            Bushy        0, 2, ..., 90        The general performance of the
                                              network follows the best expected
                                              performance.

            Bushy        1, 3, ..., 89        The general performance of the
                                              network follows the best expected
                                              performance. Performance drops
                                              below the 3 dB line for rotation
                                              angles of 45, 55, and 83 degrees.

Table 7.3  Performance of a filter synthesized by a back-propagation neural
           network using the 32 wedge sampling technique and the
           peak-to-clutter metric for a truck rotated on a variety of
           backgrounds.
Figure 7.14  Filter synthesis results using a back-propagation neural network
for the truck on the city70 background at truck rotation angles of
0, 2, ..., 90 degrees. The network was trained on 66 gray level,
142 gray level, and city70 backgrounds at truck rotation angles
of 0, 2, ..., 90 degrees using as inputs 31 computed values from
a target-centered 9x9 pixel grid.

Figure 7.15  Filter synthesis results using a back-propagation neural network
for the truck on the city70 background at truck rotation angles of
1, 3, ..., 89 degrees. The network was trained on 66 gray level,
142 gray level, and city70 backgrounds at truck rotation angles
of 0, 2, ..., 90 degrees using as inputs 31 computed values from
a target-centered 9x9 pixel grid.

Figure 7.16  Filter synthesis results using a back-propagation neural network
for the truck on the trbk70 background at truck rotation angles
of 0, 2, ..., 90 degrees. The network was trained on 66 gray level,
142 gray level, and city70 backgrounds at truck rotation angles
of 0, 2, ..., 90 degrees using as inputs 31 computed values from
a target-centered 9x9 pixel grid.

Figure 7.17  Filter synthesis results using a back-propagation neural network
for the truck on the trbk70 background at truck rotation angles
of 1, 3, ..., 89 degrees. The network was trained on 66 gray level,
142 gray level, and city70 backgrounds at truck rotation angles
of 0, 2, ..., 90 degrees using as inputs 31 computed values from
a target-centered 9x9 pixel grid.

Figure 7.18  Filter synthesis results using a back-propagation neural network
for the truck on the newbk50 background at truck rotation angles
of 0, 2, ..., 90 degrees. The network was trained on 66 gray level,
142 gray level, and city70 backgrounds at truck rotation angles
of 0, 2, ..., 90 degrees using as inputs 31 computed values from
a target-centered 9x9 pixel grid.

Figure 7.19  Filter synthesis results using a back-propagation neural network
for the truck on the newbk50 background at truck rotation angles
of 1, 3, ..., 89 degrees. The network was trained on 66 gray level,
142 gray level, and city70 backgrounds at truck rotation angles
of 0, 2, ..., 90 degrees using as inputs 31 computed values from a
target-centered 9x9 pixel grid.

Figure 7.20  Filter synthesis results using a back-propagation neural network
for the truck on the bushy background at truck rotation angles of
0, 2, ..., 90 degrees. The network was trained on 66 gray level,
142 gray level, and city70 backgrounds at truck rotation angles
of 0, 2, ..., 90 degrees using as inputs 31 computed values from
a target-centered 9x9 pixel grid.

Figure 7.21  Filter synthesis results using a back-propagation neural network
for the truck on the bushy background at truck rotation angles of
1, 3, ..., 89 degrees. The network was trained on 66 gray level,
142 gray level, and city70 backgrounds at truck rotation angles
of 0, 2, ..., 90 degrees using as inputs 31 computed values from
a target-centered 9x9 pixel grid.


7.5. Stretch and hammer neural network results
(32 wedges, peak-to-clutter)

Using binarized correlator inputs and a TPAF in the filter plane, a
stretch and hammer neural network with 32 inputs (computed from the
gray-level input image) and 600 outputs successfully synthesized filters. The
neural network was trained on a set of 138 input scenes of 128 by 128 pixels as
described in Section 7.3.

The results discussed in Section 7.4 indicate performance independent
of the background; however, there was a slight degradation in performance at
angles not included in training. This effect is expected since the wedge
sampling technique is sensitive to rotation angles. For comparison, a stretch
and hammer neural network was trained using the same wedge inputs.
Figures 7.22 through 7.29 show graphs of the peak-to-clutter ratio (in dB)
versus target rotation (in degrees) for a variety of cases. The inputs were 32
values from a target-centered 9 x 9 pixel grid as described in Section 5. The
outputs were the 600 coded values used to generate a 10-60 bandpass 3x3
superpixel TPAF. The "zero degree filter", "best expected filter", and "filter
synthesized by network" curves have the same meaning as in Section 7.2.
Performance depends on variations in scene parameters used to test the network
and is summarized in Table 7.4.

When tested on the city70 background at angles of 0, 2, ..., 90 degrees
(which were used in training), the neural network performance, by definition,
matched the best expected performance exactly. At training angles the network
performance degraded slightly for the trbk70 background and more so for the
newbk50 and bushy backgrounds. At testing rotation angles (1, 3, ..., 89
degrees) the neural network performance was adequate for pattern recognition
by optical correlation (above the 3 dB line) in all but 42 out of 318 testing
cases.
            Background   Angles of Rotation   Comments

Training    City70       0, 2, ..., 90        The performance of the network
                                              matches the best expected
                                              performance exactly.

Testing     City70       1, 3, ..., 89        The performance of the network
                                              drops below the 3 dB line in 8 out
                                              of 45 tests; filters were
                                              successfully synthesized
                                              approximately 80% of the time.

            Trbk70       0, 2, ..., 90        The general performance of the
                                              network follows the best expected
                                              performance.

            Trbk70       1, 3, ..., 89        The general performance of the
                                              network is adequate for pattern
                                              recognition. Performance drops
                                              below the 3 dB line for rotation
                                              angles of 43 and 75 degrees.

            Newbk50      0, 2, ..., 90        The general performance of the
                                              network is adequate for pattern
                                              recognition. Performance drops
                                              below the 3 dB line for rotation
                                              angles of 24, 40, 42, and 46
                                              degrees.

            Newbk50      1, 3, ..., 89        The performance of the network
                                              drops below the 3 dB line in 8 out
                                              of 45 tests; filters were
                                              successfully synthesized
                                              approximately 80% of the time.

            Bushy        0, 2, ..., 90        The performance of the network
                                              drops below the 3 dB line in 8 out
                                              of 46 tests; filters were
                                              successfully synthesized
                                              approximately 80% of the time.

            Bushy        1, 3, ..., 89        The performance of the network
                                              drops below the 3 dB line in 13
                                              out of 45 tests; filters were
                                              successfully synthesized
                                              approximately 65% of the time.

Table 7.4  Performance of a filter synthesized by a stretch and hammer
           neural network using the 32 wedge sampling technique and the
           peak-to-clutter metric for a truck on a variety of backgrounds.
Figure 7.22  Filter synthesis results using a stretch and hammer neural
network for the truck on the city70 background at truck rotation
angles of 0, 2, ..., 90 degrees. The network was trained on 66 gray
level, 142 gray level, and city70 backgrounds at truck rotation
angles of 0, 2, ..., 90 degrees using as inputs 31 computed values
from a target-centered 9x9 pixel grid.

Figure 7.23  Filter synthesis results using a stretch and hammer neural
network for the truck on the city70 background at truck rotation
angles of 1, 3, ..., 89 degrees. The network was trained on 66 gray
level, 142 gray level, and city70 backgrounds at truck rotation
angles of 0, 2, ..., 90 degrees using as inputs 31 computed values
from a target-centered 9x9 pixel grid.

Figure 7.24  Filter synthesis results using a stretch and hammer neural
network for the truck on the trbk70 background at truck rotation
angles of 0, 2, ..., 90 degrees. The network was trained on 66 gray
level, 142 gray level, and city70 backgrounds at truck rotation
angles of 0, 2, ..., 90 degrees using as inputs 31 computed values
from a target-centered 9x9 pixel grid.

Figure 7.25  Filter synthesis results using a stretch and hammer neural
network for the truck on the trbk70 background at truck rotation
angles of 1, 3, ..., 89 degrees. The network was trained on 66 gray
level, 142 gray level, and city70 backgrounds at truck rotation
angles of 0, 2, ..., 90 degrees using as inputs 31 computed values
from a target-centered 9x9 pixel grid.

Figure 7.26  Filter synthesis results using a stretch and hammer neural
network for the truck on the newbk50 background at truck
rotation angles of 0, 2, ..., 90 degrees. The network was trained
on 66 gray level, 142 gray level, and city70 backgrounds at truck
rotation angles of 0, 2, ..., 90 degrees using as inputs 31 computed
values from a target-centered 9x9 pixel grid.

Figure 7.27  Filter synthesis results using a stretch and hammer neural
network for the truck on the newbk50 background at truck
rotation angles of 1, 3, ..., 89 degrees. The network was trained
on 66 gray level, 142 gray level, and city70 backgrounds at truck
rotation angles of 0, 2, ..., 90 degrees using as inputs 31 computed
values from a target-centered 9x9 pixel grid.

Figure 7.28  Filter synthesis results using a stretch and hammer neural
network for the truck on the bushy background at truck rotation
angles of 0, 2, ..., 90 degrees. The network was trained on 66 gray
level, 142 gray level, and city70 backgrounds at truck rotation
angles of 0, 2, ..., 90 degrees using as inputs 31 computed values
from a target-centered 9x9 pixel grid.

Figure 7.29  Filter synthesis results using a stretch and hammer neural
network for the truck on the bushy background at truck rotation
angles of 1, 3, ..., 89 degrees. The network was trained on 66 gray
level, 142 gray level, and city70 backgrounds at truck rotation
angles of 0, 2, ..., 90 degrees using as inputs 31 computed values
from a target-centered 9x9 pixel grid.
7.6. Comparison of back-propagation and stretch and hammer neural
network results

Both the stretch and hammer neural network and the back-propagation
neural network perform a mapping from the input space to the output space.
The stretch and hammer neural network uses a deformed least squares plane
of dimensionality equal to the number of inputs to map the inputs to the
outputs. The back-propagation neural network also uses a function of
dimensionality equal to the number of inputs. However, this function is created
by reducing the sum of the squared differences between the network outputs
and the training outputs. This mapping function is deformed or altered by
changing the values of the interconnection weights.

The results show that the stretch and hammer neural network did not
perform as well as the back-propagation neural network using the same
input/output training set. However, the stretch and hammer neural network had
a total of 138 + 31 + 1 = 170 adjustable parameters for each output coded filter
value, whereas the back-propagation neural network had 0.5(31 x 200 + 200
+ 1) = 3300.5 adjustable parameters for each such value. Thus the
back-propagation neural network had an advantage of approximately 17 times in
the number of adjustable parameters available for characterizing the rules for
mapping inputs to outputs. However, the 3300.5 adjustable parameters were
not independent because, as indicated in Section 6.4, the average number of
adjustable parameters per output for the back-propagation neural network was
63,500/600 = 105.8. Current software limits the number of adjustable
parameters for the stretch and hammer neural network, but if this number
could be increased to 3300 it is reasonable to expect that this network would
have comparable or superior performance compared to the back-propagation
neural network.
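
Two of the parameter counts quoted above can be checked with a few lines of
arithmetic. The sketch below rests on an assumption that is not stated in this
section: that the figure 63,500 is one half of the weights and biases of a
fully connected network with 31 inputs, 200 hidden neurons, and 600 outputs.
That reading reproduces the quoted number, but Section 6.4 remains the
authority on how it was derived. The function names are hypothetical.

```python
def stretch_hammer_params(n_train, n_inputs):
    """Adjustable parameters per output as counted in the text:
    one per training example, one per input, plus one."""
    return n_train + n_inputs + 1


def backprop_weight_count(n_inputs, n_hidden, n_outputs):
    """Weights plus biases of a fully connected single-hidden-layer network
    (assumed architecture for this check)."""
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs


sh_per_output = stretch_hammer_params(138, 31)
bp_total = backprop_weight_count(31, 200, 600)
# The factor 0.5 is taken from the text as printed.
bp_half_total = 0.5 * bp_total
bp_avg_per_output = bp_half_total / 600

print(sh_per_output, bp_half_total, round(bp_avg_per_output, 1))
```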
8. SUMMARY AND CONCLUSION


The feasibility of using neural networks to synthesize filters for a HAC
system was demonstrated. Both stretch and hammer and back-propagation
neural networks successfully synthesized filters. The stretch and hammer
neural network trained more rapidly and had the advantage of exact learning.
For the filter synthesis results obtained here the back-propagation neural
network performed better than the stretch and hammer neural network,
although the back-propagation neural network required approximately 40
hours to train on a 486-class 33 MHz desktop computer, whereas the stretch
and hammer neural network required approximately 3 hours. However,
training time and performance depended on the number of training examples
and the number of hidden neurons. In a simple test where these variables were
selected so that the number of independent adjustable parameters (or neural
network weights) was the same, the stretch and hammer neural network both
trained more rapidly and outperformed the back-propagation neural network.

A  relatively  new  input  scene  binarization  technique  was  used.  This 
technique  involved  edge-enhancement  prior  to  thresholding  to  retain  more 
target  information,  thus  providing  a  sufficient  signal  (target)  to  noise 
(background)  ratio  for  pattern  recognition  for  a  variety  of  cluttered 
backgrounds. 
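
The idea of enhancing edges before thresholding can be illustrated with a
small sketch. The operator and threshold rule actually used in this work are
not given in this section, so the choices below (a 3x3 Laplacian-style
high-pass kernel with replicated borders, and a threshold at the mean of the
enhanced image) are assumptions made purely for illustration, and the
function names are hypothetical.

```python
def edge_enhance(image):
    """Return the magnitude of a 3x3 Laplacian-style high-pass response.

    The kernel (8 at the center, -1 elsewhere) is an assumed choice.
    Border pixels are handled by replicating the nearest valid pixel.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    weight = 8.0 if (dr == 0 and dc == 0) else -1.0
                    acc += weight * image[rr][cc]
            out[r][c] = abs(acc)
    return out


def edge_binarize(image):
    """Edge-enhance, then threshold at the mean enhanced value (assumed rule)."""
    enhanced = edge_enhance(image)
    flat = [v for row in enhanced for v in row]
    threshold = sum(flat) / len(flat)
    return [[1 if v > threshold else 0 for v in row] for row in enhanced]


# A bright 3x3 block (level 200) on a uniform level-66 background.
scene = [[66] * 7 for _ in range(7)]
for r in range(2, 5):
    for c in range(2, 5):
        scene[r][c] = 200
binary = edge_binarize(scene)
```

In this toy scene the uniform background and the interior of the block both
map to 0, while the block outline maps to 1, so the binary image keeps the
target shape regardless of the background gray level.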
TPAF filters were used for correlation. A filter was generated by
binarizing a target using the same technique as for input scene binarization,
performing a computer simulated Fourier transform on this binarized target
(using TLA = 0 degrees to create a symmetric cosine BPOF), averaging 3x3
pixel grids, and multiplying the resulting 3x3 superpixel BPOF by a 10-60
pixel radius bandpass to create a TPAF with only 600 coded values. The neural
networks used in this research were trained to produce these 600 values as
outputs.
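
A rough consistency check on the figure of 600 coded values is sketched below.
It assumes that superpixel centers lie on a square lattice with 3-pixel
spacing centered on the zero-frequency point, and that the symmetry of the
cosine BPOF leaves only one member of each pair of opposite superpixels
independent. Both assumptions are mine; the exact coding is the one described
in Section 4, so the count obtained here should only be expected to land near
600, not to match it exactly.

```python
def count_coded_values(inner_radius=10.0, outer_radius=60.0,
                       superpixel=3, half_plane=True):
    """Count superpixel centers inside the bandpass annulus.

    Centers are assumed to sit at (superpixel * i, superpixel * j) for
    integer i, j. With half_plane=True only one of each symmetric pair
    (i, j) and (-i, -j) is counted.
    """
    max_index = int(outer_radius // superpixel) + 1
    count = 0
    for i in range(-max_index, max_index + 1):
        for j in range(-max_index, max_index + 1):
            radius = superpixel * (i * i + j * j) ** 0.5
            if not (inner_radius <= radius <= outer_radius):
                continue
            if half_plane and not (j > 0 or (j == 0 and i > 0)):
                continue
            count += 1
    return count


independent_values = count_coded_values()
all_values = count_coded_values(half_plane=False)
print(independent_values, all_values)
```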

The training inputs to the neural networks were values determined from
the input scene. An input scene sampling strategy was developed that provided
a sufficient amount of representative information to the neural networks. A
variety of sampling strategies were analyzed. The strategy that most
successfully ignored background effects was a 32 wedge input scene sampling
technique. Sampling the input scene using target-centered sampling grids
implies previous knowledge of target position. This restriction limits the
applicability of filter synthesis to target tracking or target confirmation tasks.
Advances have been made in blob recognition which may enable neural
network filter synthesis to be used for target recognition. If after binarization
the center of a blob can be found (representing the center of the target in the
input scene), then it should be possible to use an input plane sampling grid for
neural network filter synthesis.

Pattern recognition by optical correlation relies on using the properties
of the Fourier spectrum of a target. Unfortunately, the Fourier spectrum varies
when the target undergoes rotation, scale, aspect, or other distortions. A single
filter can successfully correlate only a limited range of these distortions. Thus
successful correlation requires a large number of filters to accommodate all
possible orientations and scalings of a target. This research shows how neural
network filter synthesis can accommodate in-plane rotation distortions. This
filter synthesis approach may be feasible for accommodating other distortions
if adequate input/output training sets are used.

Using a larger filter bandpass should allow for improved neural network
filter synthesis performance, because more neural network outputs could be
employed. Correlation performance could also be improved by using every pixel
to synthesize filters. Using every pixel also would require nine times more
neural network outputs, but this increase could be accommodated by employing
more than one network to produce the coded filter values. Correlation
performance could also be improved by training a separate neural network to
learn a complex binary amplitude pattern of the TPAF filter.
APPENDIX


Figure A.1  Truck image.

Figure A.2  City70 background.
COMPARISON OF RADIAL BASIS FUNCTION
AND CARDINAL CUBIC SPLINE INTERPOLATION


Steven C. Gustafson, Troy A. Rhoadarmer, John S. Loomis,
and Gordon R. Little
University of Dayton, 300 College Park,

A key problem in the implementation of radial basis function neural networks (e.g.,
Moody and Darken, 1987) is the determination of basis function widths. A diagonal dominance
technique has been described (Gustafson et al., 1992a,b) and demonstrated (Gilbert and
Gustafson, 1993) that determines these widths by relating them to neural network stability, i.e.,
the widths are selected so that small changes in the training data yield small changes in the
neural network weights. Here this technique is investigated for one-variable (single-input)
functions by comparing Gaussian radial basis function interpolation with cardinal cubic spline
interpolation, which is the smoothest possible interpolation according to the least integrated
squared second derivative criterion required by regularization theory (e.g., Poggio and Girosi,
1990).
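
For reference, the smoothness criterion named above can be written out explicitly. The symbols $J$, $a$, and $b$ are introduced here only for illustration and do not appear in the original text; the interval $[a, b]$ is assumed to span the training points.

```latex
\min_{f}\; J[f] = \int_{a}^{b} \bigl[ f''(x) \bigr]^{2} \, dx
\quad \text{subject to} \quad f(x_i) = y_i, \qquad i = 1, \dots, m .
```

Among twice-differentiable interpolants, the minimizer of this functional is a cubic spline with zero second derivative at the end points, which is why a cubic spline serves as the maximum-smoothness reference curve.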


A Gaussian radial basis function neural network interpolates $m$ training points $(x_i, y_i)$
using $f(x) = \sum_j c_j \exp[-(x - x_j)^2 / 2\sigma_j^2]$, where the $\sigma_j$ are basis function
widths and the $c_j$ are neural network weights. Once the $\sigma_j$ are determined the $c_j$ are
found by solving $m$ simultaneous linear equations in $m$ unknowns $y_i = \sum_j c_j a_{ij}$, where
$a_{ij} = \exp[-(x_i - x_j)^2 / 2\sigma_j^2]$. The diagonal dominance technique selects the
$\sigma_j$ such that the matrix $A = \{a_{ij}\}$ is diagonally dominant by an amount $\varepsilon$
for each column, i.e., so that $\varepsilon = a_{jj} - \sum_{i \ne j} a_{ij}$ for all $j$. It is well
known that the $\infty$-norm of $A$ is $\|A\|_\infty = \max_i \sum_j |a_{ij}|$ and that the
$\infty$-norm and the 2-norm are related by $\|A\|_2 \le \sqrt{m}\,\|A\|_\infty$. Also, it has been
shown that the $\infty$-norm of $A^{-1}$ is $\|A^{-1}\|_\infty \le 1/\varepsilon$ (Varah, 1975).
Using $a_{jj} = 1$, the 2-norm condition number of $A$ is thus
$\kappa_2 = \|A\|_2 \|A^{-1}\|_2 \le m(2 - \varepsilon)/\varepsilon$. Finally, stability defined by
$1/r_c$ is bounded by $1/r_c \ge [(\kappa_2 r_y)^{-1} - 1]/2$ for $\kappa_2 r_y < 1$, where
$r_c = [\sum_j (c_j' - c_j)^2 / \sum_j c_j^2]^{1/2}$ is the fractional root-mean-square coefficient
change, $r_y = [\sum_i (y_i' - y_i)^2 / \sum_i y_i^2]^{1/2}$ is the fractional root-mean-square data
output change, and the $c_j$ change to $c_j'$ if the $y_i$ change to $y_i'$ (e.g., Golub and Van
Loan, 1989). Thus for positive $\varepsilon$ the diagonal dominance technique ensures a bound on
neural network stability. This technique (Gustafson et al., 1992a) has been suggested for
(Gustafson et al., 1992b) and successfully demonstrated in (Gilbert and Gustafson, 1993) image
processing applications. For a specified $\varepsilon$ determination of the $\sigma_j$ using the
diagonal dominance technique requires the solution of $m$ independent nonlinear equations each in
one unknown, i.e., the solution of $\varepsilon = a_{jj} - \sum_{i \ne j} a_{ij}$ for $\sigma_j$ is
required for all $j$.
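
The width selection and weight solution described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`column_widths`, `solve_linear`, `fit_rbf`, `condition_bound`), the bisection search, and the plain Gaussian elimination are choices made here for self-containment. The basis function is taken to use the same $2\sigma_j^2$ denominator as $a_{ij}$, the training abscissas are assumed distinct and spaced much more widely than 1e-6, and the dominance is assumed to satisfy $2 - m < \varepsilon < 1$ so that each width equation has exactly one root.

```python
import math


def column_widths(xs, eps, iters=100):
    """Solve eps = 1 - sum_{i != j} exp(-(x_i - x_j)^2 / (2 sigma_j^2)) for each sigma_j."""
    m = len(xs)
    if not (2 - m < eps < 1):
        raise ValueError("eps must satisfy 2 - m < eps < 1")
    widths = []
    for j, xj in enumerate(xs):
        d2 = [(xi - xj) ** 2 for i, xi in enumerate(xs) if i != j]

        def dominance(sigma):
            return 1.0 - sum(math.exp(-d / (2.0 * sigma * sigma)) for d in d2)

        # dominance falls monotonically from 1 (narrow basis) toward 2 - m (wide basis),
        # so grow the upper bracket until it passes eps, then bisect.
        lo, hi = 1e-6, 1.0
        while dominance(hi) > eps:
            hi *= 2.0
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if dominance(mid) > eps:
                lo = mid
            else:
                hi = mid
        widths.append(0.5 * (lo + hi))
    return widths


def solve_linear(a, b):
    """Solve a c = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(m):
        p = max(range(k, m), key=lambda r: abs(aug[r][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for r in range(k + 1, m):
            factor = aug[r][k] / aug[k][k]
            for col in range(k, m + 1):
                aug[r][col] -= factor * aug[k][col]
    c = [0.0] * m
    for k in range(m - 1, -1, -1):
        s = sum(aug[k][col] * c[col] for col in range(k + 1, m))
        c[k] = (aug[k][m] - s) / aug[k][k]
    return c


def fit_rbf(xs, ys, eps):
    """Return (interpolant f, widths sigma_j, weights c_j) for dominance eps."""
    sig = column_widths(xs, eps)
    # Row i, column j; the width is indexed by the column, as in the text.
    a = [[math.exp(-(xi - xj) ** 2 / (2.0 * sj * sj)) for xj, sj in zip(xs, sig)]
         for xi in xs]
    c = solve_linear(a, ys)

    def f(x):
        return sum(cj * math.exp(-(x - xj) ** 2 / (2.0 * sj * sj))
                   for cj, xj, sj in zip(c, xs, sig))

    return f, sig, c


def condition_bound(m, eps):
    """Upper bound m (2 - eps) / eps on the 2-norm condition number (eps > 0 only)."""
    if eps <= 0:
        raise ValueError("the bound applies only for positive eps")
    return m * (2.0 - eps) / eps


# Hypothetical training data, chosen only to exercise the sketch.
xs = [-2.0, -1.0, 0.0, 0.5, 2.0]
ys = [0.0, 0.2, 1.0, 0.4, 0.1]
f, sig, c = fit_rbf(xs, ys, eps=0.5)
```

Because each width enters only its own column of $A$, the $m$ width equations decouple and can be solved one at a time, which is what the per-column loop in `column_widths` reflects.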

Figure 1a shows points between $x = -3$ and 3 for an impulse function at $x = 0$ that is
randomly sampled at 31 points, where the maximum and minimum points satisfy $|x| > 7$. An
$\varepsilon = 0$ Gaussian radial basis function curve and a cardinal cubic spline curve that
interpolate these 31 points are also shown, where the spline curve is cardinal because it is
independent of the (distant) end conditions. Figure 1b is a plot of the root-mean-square
difference between these curves from $x = -3$ to 3 as a function of $\varepsilon$; it indicates
that $\varepsilon = -0.19$ yields the optimum agreement (a plot of maximum absolute difference has
nearly the same minimum). Figure 1c is a histogram of $\varepsilon$ values that yield the optimum
agreement for many sets of 31 random training points, where the $x$ values are uniformly random
and extend from $x < -7$ to $x > 7$ and where the $y$ values are uniformly random from 0 to 1
between $x = -3$ and 3 and are constant from the minimum and maximum $x$ within this range to
$x < -7$ and $x > 7$, respectively.
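
The comparison metric described above can be sketched as follows. This is an illustrative sketch only: a natural cubic spline (zero second derivative at the distant end knots) stands in for the cardinal spline, on the assumption that end points far outside the comparison interval make the end conditions immaterial. The names `natural_cubic_spline` and `curve_differences`, the uniform 601-point grid, and the sample knots are all hypothetical. The radial basis function curve would be passed in as any callable.

```python
import bisect
import math


def natural_cubic_spline(xs, ys):
    """Return a callable natural cubic spline through knots (xs, ys); xs strictly increasing."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three knots")
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]
    # Interior second derivatives solve a symmetric tridiagonal system;
    # the natural end conditions fix the two end second derivatives at zero.
    diag = [2.0 * (h[k] + h[k + 1]) for k in range(n - 2)]
    rhs = [6.0 * ((ys[k + 2] - ys[k + 1]) / h[k + 1] - (ys[k + 1] - ys[k]) / h[k])
           for k in range(n - 2)]
    for k in range(1, n - 2):          # forward elimination (Thomas algorithm)
        w = h[k] / diag[k - 1]
        diag[k] -= w * h[k]
        rhs[k] -= w * rhs[k - 1]
    m2 = [0.0] * n                     # second derivatives at the knots
    for k in range(n - 3, -1, -1):     # back substitution
        m2[k + 1] = (rhs[k] - h[k + 1] * m2[k + 2]) / diag[k]

    def spline(x):
        # Locate the knot interval containing x, clamped to the first or last interval.
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 2)
        a = (xs[i + 1] - x) / h[i]
        b = (x - xs[i]) / h[i]
        return (a * ys[i] + b * ys[i + 1]
                + ((a ** 3 - a) * m2[i] + (b ** 3 - b) * m2[i + 1]) * h[i] ** 2 / 6.0)

    return spline


def curve_differences(f, g, lo=-3.0, hi=3.0, n_grid=601):
    """Return (root-mean-square, maximum absolute) difference of f and g on a uniform grid."""
    if n_grid < 2:
        raise ValueError("n_grid must be at least 2")
    grid = [lo + (hi - lo) * k / (n_grid - 1) for k in range(n_grid)]
    diffs = [f(x) - g(x) for x in grid]
    rms = math.sqrt(sum(d * d for d in diffs) / n_grid)
    return rms, max(abs(d) for d in diffs)


# Hypothetical knots: an impulse at x = 0 with distant end points.
xs = [-7.5, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 7.5]
ys = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
spline = natural_cubic_spline(xs, ys)
placeholder = lambda x: 0.0  # stands in for a radial basis function interpolant
rms, worst = curve_differences(spline, placeholder)
```

Repeating the `curve_differences` call for interpolants fitted at a range of dominance values and recording the minimizing value gives one point of a histogram like the one described above.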

These results indicate that Gaussian radial basis function neural networks should have
basis function widths determined by diagonal dominance with a positive $\varepsilon$ as close to
zero as acceptable neural network stability permits. Figure 1c indicates that such neural networks
are most likely to yield interpolation curves in optimal agreement with the maximum-smoothness
interpolation curves determined by regularization or spline methods. However, unlike these
methods, radial basis function techniques are readily applicable to neural networks with multiple
inputs and nonuniform training points.

Figure 1. (a) Randomly sampled impulse function with $\varepsilon = 0$ Gaussian radial basis
function (broken) and cardinal cubic spline (solid) interpolation curves. (b) Plot
of root-mean-square difference between these curves as a function of diagonal
dominance $\varepsilon$. (c) Histogram of $\varepsilon$ values that yield minimum root-mean-square
differences between Gaussian radial basis function and cardinal cubic spline
interpolation curves for random training points.